Disk Scheduling In Linux
It’s really quite complex!

My Goals
• Teach a little bit of Computer Science.
• Show that the easy stuff is hard in real life.
• BTW, operating systems are cool!

How A Disk Works
• A disk is a bunch of data blocks.
• A disk has a disk head.
• Data can only be accessed when the head is on that block.
• Moving the head takes FOREVER.
• Data transfer is fast (once the disk head gets there).

The Problem Statement
• Suppose we have multiple requests for disk blocks … Which should we access first?

P.S. Yes, order does matter … a lot.

Formal Problem Statement
• Input: A set of requests.
• Input: The current disk head location.
• Input: Algorithm state (direction?)
• Output: The next disk block to get.

P.S. Not the whole ordering, just the next one.
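
A minimal sketch of that interface in Python (the language is our choice, not the slides'). The names SchedulerState, run, and first_pending are hypothetical, and first_pending is only a placeholder policy so the loop has something to call. The point is the shape: the scheduler is asked for one block at a time and never returns a whole ordering.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SchedulerState:
    head: int           # current disk head location (block number)
    direction: int = 1  # +1 = toward higher blocks, -1 = toward lower blocks


# A scheduler sees the pending set plus the state and returns ONE block.
Scheduler = Callable[[List[int], SchedulerState], int]


def run(scheduler: Scheduler, requests: List[int], head: int) -> Tuple[List[int], int]:
    """Serve every request; return (service order, total head travel)."""
    pending = list(requests)
    state = SchedulerState(head=head)
    order, travel = [], 0
    while pending:
        nxt = scheduler(pending, state)   # "just the next one"
        pending.remove(nxt)
        travel += abs(nxt - state.head)   # seek distance in blocks
        state.head = nxt
        order.append(nxt)
    return order, travel


def first_pending(pending: List[int], state: SchedulerState) -> int:
    """Placeholder policy: take whatever is first in the list."""
    return pending[0]
```

Total head travel is used here as a stand-in for seek time, which is a simplification of how a real disk behaves.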

The Goals
• Maximize throughput
  – Operations per second.
• Maximize fairness
  – Whatever that means.
• Avoid starvation
  – And very long waits.
• Real-time concerns
  – These can be life threatening (and bank accounts too!).

What’s Not Done
• Every operating system assigns priorities to all sorts of things
  – Requests for RAM
  – Requests for the CPU
  – Requests for network access
• Few use disk request priority.

Why Talk About This Now?
• Because in the last year Linux has had three very different schedulers, and they’ve been tested against each other.

A Pessimal Algorithm
• Choose the disk request furthest from the current disk head.
• This is known to be as bad as any algorithm without idle periods.

An Optimal Algorithm
• Choose the disk request closest to the disk head.
• It’s Optimal!

Optimal Analyzed
• This is known to have the highest performance in operations per second.
• It’s unfair to requests toward the edges of the disk.
• It allows for starvation.
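
The two greedy rules (furthest-first and closest-first) can be compared with a few lines of Python. This is a sketch under simplifying assumptions: all requests are known up front, and total head travel in blocks stands in for time. The function names and the request list are ours, not from the slides.

```python
def closest_first(pending, head):
    """Pick the pending block nearest the head."""
    return min(pending, key=lambda block: abs(block - head))


def furthest_first(pending, head):
    """Pick the pending block furthest from the head (the pessimal rule)."""
    return max(pending, key=lambda block: abs(block - head))


def total_travel(pick, requests, head):
    """Serve all requests with `pick`; return total head movement in blocks."""
    pending, travel = list(requests), 0
    while pending:
        nxt = pick(pending, head)
        pending.remove(nxt)
        travel += abs(nxt - head)
        head = nxt
    return travel


requests = [98, 183, 37, 122, 14, 124, 65, 67]  # made-up block numbers
print("closest first: ", total_travel(closest_first, requests, head=53))
print("furthest first:", total_travel(furthest_first, requests, head=53))
```

Starvation shows up once requests keep arriving: as long as new requests land near the head, closest_first never gets around to a block at the far edge.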

First Come First Serve
• Serve the requests in their arrival order.
  – It’s fair.
  – It avoids starvation.
  – Its performance is medium lousy.
  – Some simple OSes use this.
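
For comparison, a sketch of first come first serve in Python. The function name fcfs_travel and the sample block numbers are ours; head travel in blocks is again a rough stand-in for time.

```python
from collections import deque


def fcfs_travel(requests, head):
    """Serve requests strictly in arrival order; return total head travel."""
    queue = deque(requests)  # left end = oldest request
    travel = 0
    while queue:
        block = queue.popleft()
        travel += abs(block - head)
        head = block
    return travel


print(fcfs_travel([98, 183, 37, 122, 14, 124, 65, 67], head=53))
```

The head zigzags across the disk because arrival order says nothing about position, which is where the lousy part comes from.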

Elevator
• Move back and forth, solving requests as you go.
• Performance
  – Good.
• Fairness
  – Files near the middle of the disk get 2x the attention.
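
A sketch of the back-and-forth sweep in Python. One assumption to flag: this version reverses as soon as nothing is pending in the current direction, rather than travelling all the way to the physical edge of the disk. The names elevator_next and elevator_order are hypothetical.

```python
def elevator_next(pending, head, direction):
    """Return (next_block, direction); direction is +1 (up) or -1 (down)."""
    ahead = [b for b in pending if (b - head) * direction >= 0]
    if not ahead:               # nothing left this way: turn around
        direction = -direction
        ahead = pending
    nxt = min(ahead, key=lambda b: abs(b - head))
    return nxt, direction


def elevator_order(requests, head, direction=1):
    """Service order for a fixed set of requests."""
    pending, order = list(requests), []
    while pending:
        nxt, direction = elevator_next(pending, head, direction)
        pending.remove(nxt)
        order.append(nxt)
        head = nxt
    return order


print(elevator_order([98, 183, 37, 122, 14, 124, 65, 67], head=53, direction=-1))
```

The head crosses the middle of the disk on every sweep in both directions, while each edge is visited once per round trip, which is the source of the 2x attention.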

Cyclic Elevator
• Move toward the bottom, solving requests as you go.
• When you’ve solved the lowest request, seek all the way to the highest request.
  – Performance penalty occurs here.
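
The one-way sweep can be sketched in Python as follows. The names cyclic_elevator_next and cyclic_order are ours, and the request set is assumed fixed for the duration of the run.

```python
def cyclic_elevator_next(pending, head):
    """Sweep downward; when nothing is at or below the head, jump to the top."""
    below = [b for b in pending if b <= head]
    if below:
        return max(below)   # nearest request at or below the head
    return max(pending)     # long seek back to the highest request


def cyclic_order(requests, head):
    """Service order for a fixed set of requests."""
    pending, order = list(requests), []
    while pending:
        nxt = cyclic_elevator_next(pending, head)
        pending.remove(nxt)
        order.append(nxt)
        head = nxt
    return order


print(cyclic_order([98, 183, 37, 122, 14, 124, 65, 67], head=53))
```

Every block is passed exactly once per cycle and always in the same direction, so no region of the disk gets extra attention.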

Cyclic Elevator Analyzed
• It’s fair.
• It’s starvation-proof.
• Its performance is very good.
  – Almost as good as elevator.
• It’s used in real life (and every textbook).

Deadline Scheduler (by Jens Axboe)
• Each request has a deadline.
• Service requests using cyclic elevator.
• When a deadline is threatened, skip directly to that request.
• For Real Time (which means xmms)
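
A sketch of the idea on this slide in Python, not the kernel's actual code. The Request class, the deadline_next function, and the slack parameter (how close to its deadline a request must be before it counts as threatened) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    block: int
    deadline: float  # absolute time by which the request should be served


def deadline_next(pending, head, now, slack=0.0):
    """Serve a threatened request if there is one, else cyclic-elevator order."""
    urgent = [r for r in pending if r.deadline - now <= slack]
    if urgent:
        # Skip directly to the request whose deadline is earliest.
        return min(urgent, key=lambda r: r.deadline)
    # Otherwise sweep downward, wrapping to the highest block when needed.
    below = [r for r in pending if r.block <= head]
    candidates = below if below else pending
    return max(candidates, key=lambda r: r.block)


pending = [Request(40, 100.0), Request(10, 100.0), Request(900, 5.0)]
print(deadline_next(pending, head=50, now=0.0).block)  # no deadline threatened yet
print(deadline_next(pending, head=50, now=5.0).block)  # block 900 has expired
```

If requests with tight deadlines keep arriving, the urgent branch is taken every time and the sweep never runs.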

Deadline Analyzed
• Gives priority to real-time processes.
• Fair otherwise.
• No starvation
  – Unless a real-time process goes wild.

Anticipatory Scheduling (The Idea)
• Developed by several people.
• Coded by Nick Piggin.
• Assume that an I/O request will be closely followed by another nearby one.

Anticipatory Scheduling (The Algorithm)
• After servicing a request … WAIT.
  – Yes, this means do nothing even though there is work to be done.
• If a nearby request occurs soon, service it.
• If not … cyclic elevator.
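
The decision made after each completed request can be sketched in Python. This illustrates the slide only, not the kernel implementation. The function name and the max_wait and nearby parameters are hypothetical, and their default values are made up for the example.

```python
def anticipatory_next(pending, head, waited, max_wait=0.006, nearby=8):
    """Return the block to serve next, or None to keep the disk idle.

    head   -- block just serviced (where the head is sitting now)
    waited -- seconds spent idle since that request finished
    """
    close = [b for b in pending if abs(b - head) <= nearby]
    if close:
        # The bet paid off: a nearby request showed up.
        return min(close, key=lambda b: abs(b - head))
    if waited < max_wait:
        return None             # deliberately do nothing for a little longer
    if not pending:
        return None             # nothing to do at all
    # Gave up waiting: fall back to cyclic elevator.
    below = [b for b in pending if b <= head]
    return max(below) if below else max(pending)


print(anticipatory_next([500], head=100, waited=0.0))       # still waiting
print(anticipatory_next([500, 103], head=100, waited=0.0))  # nearby request arrived
print(anticipatory_next([500, 20], head=100, waited=0.01))  # wait expired
```

Returning None while work is pending is the whole trick: a short idle period is traded for the chance to avoid two long seeks.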

Anticipatory Scheduling Analyzed
• Fair.
• No support for real time.
• No starvation.
• Makes assumptions about how processes work in real life.
  – That’s the idleness.
  – They’d better be right.

Benchmarking the Anticipatory Scheduler
Source: http://www.cs.rice.edu/~ssiyer/r/antsched.pdf

Completely Fair Queuing (also by Jens)
• Real-time needs always come first.
• Otherwise, no user should be able to hog the disk drive.
• Priorities are OK.
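
The first two principles can be sketched as one queue per owner, served round-robin, with a separate real-time queue that always goes first. This is a toy model in Python, not CFQ itself: the FairQueues class and its method names are hypothetical, and priorities and head position are ignored.

```python
from collections import deque


class FairQueues:
    def __init__(self):
        self.rt = deque()     # real-time requests: always served first
        self.queues = {}      # owner -> deque of that owner's blocks
        self.turn = deque()   # round-robin order of owners with pending work

    def add(self, owner, block, realtime=False):
        if realtime:
            self.rt.append(block)
            return
        queue = self.queues.setdefault(owner, deque())
        if not queue:         # owner had nothing pending: give it a turn
            self.turn.append(owner)
        queue.append(block)

    def next(self):
        """Return the next block to dispatch, or None if idle."""
        if self.rt:
            return self.rt.popleft()
        if not self.turn:
            return None
        owner = self.turn.popleft()
        block = self.queues[owner].popleft()
        if self.queues[owner]:        # still has work: back of the line
            self.turn.append(owner)
        return block


q = FairQueues()
for block in (10, 11, 12):
    q.add("hog", block)
q.add("editor", 500)
q.add("player", 900, realtime=True)
print([q.next() for _ in range(5)])
```

The hog queued three requests before the editor queued one, yet the editor is served second among the non-real-time requests. That is the disk-hog protection.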

The CFQ Picture
[Figure: diagram with the labels RT, Q1, Q2, Q20, Dispatcher, 10 ms Valve, Disk Queue, and Disk]
Yes, Gabe’s art is better

Analyzing CFQ
• Complex!!!!!
• Has Real Time support (Jens likes that).
• Fair, and fights disk hogs!
  – A new kind of fairness!!
• No starvation is possible.
  – Real-time craziness excepted.
• Allows for priorities
  – But no one knows how to assign them.

Benchmark #1
• time (find kernel-tree -type f | xargs cat > /dev/null)

Dead: 3 minutes 39 seconds
CFQ: 5 minutes 7 seconds
AS: 17 seconds

The Benchmark #2

    for i in 1 2 3 4 5 6
    do
        time (find kernel-tree-$i -type f | xargs cat > /dev/null) &
    done

Dead: 3m56.791s
CFQ: 5m50.233s
AS: 0m53.087s

The Benchmark #3

    time (cp 1-gig-file foo; sync)

AS: 1:22.36
CFQ: 1:25.54
Dead: 1:11.03

Benchmark #4
• time ssh testbox xterm -e true

Old: 62 seconds
Dead: 14 seconds
CFQ: 11 seconds
AS: 12 seconds

Benchmark #5
• While “cat 512M-file > /dev/null” is running
• Measure “write-and-fsync -f -m 100 outfile”

Old: 6.4 seconds
Dead: 7.7 seconds
CFQ: 8.4 seconds
AS: 11.9 seconds

The Winner is …
• Andrew Morton said, "the anticipatory scheduler is wiping the others off the map, and 2.4 is a disaster."

What You Learned
• In Real Operating Systems …
  – Performance is never obvious.
  – No one uses the textbook algorithm.
  – Benchmarking is everything.
• Theory is useful, if it helps you benchmark better.
• Linux is cool, since we can see the development process.