Outline OS schedulers Unix scheduling Linux 2 4
- Slides: 75
Outline
• OS schedulers
• Unix scheduling
• Linux 2.4 scheduler
• Linux 2.6 scheduler
  – O(1) scheduler
  – CFS scheduler
Introduction: preemptive & cooperative multitasking
• A multitasking operating system is one that can interleave the execution of more than one process, so that they appear to run simultaneously.
• Multitasking operating systems come in two flavors: cooperative multitasking and preemptive multitasking.
  – Linux provides preemptive multitasking.
  – Mac OS 9 and earlier are the most notable examples of cooperative multitasking.
Linux scheduler – Scheduling Policy
• The scheduling policy determines what runs when. It has to balance two goals:
  – fast process response time (low latency)
  – maximal system utilization (high throughput)
• Process classification:
  – I/O-bound processes spend much of their time submitting and waiting on I/O requests.
  – Processor-bound processes spend much of their time executing code.
• Unix variants tend to favor I/O-bound processes, thus providing good process response time.
Linux scheduler – Process Priority
• Linux's priority-based scheduling
  – Ranks processes based on their worth and their need for processor time.
  – Both the user and the system may set a process's priority to influence the scheduling behavior of the system.
• Dynamic priority-based scheduling
  – Begins with an initial base priority.
  – Then lets the scheduler increase or decrease the priority dynamically to fulfill scheduling objectives.
  – E.g., a process that spends more time waiting on I/O receives an elevated dynamic priority.
Linux scheduler – Priority Ranges
• Linux uses two separate priority ranges.
  – The nice value ranges from -20 to +19, with a default of 0.
    • Larger nice values correspond to a lower priority (you are being "nice" to the other processes on the system).
  – The real-time priority ranges, by default, from 0 to 99.
    • All real-time processes are at a higher priority than normal processes.
    • Linux implements real-time priorities in accordance with the relevant POSIX standards.
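The two ranges can be pictured on a single internal scale. The sketch below assumes the layout used by the 2.6 O(1) scheduler, in which internal priorities 0 to 99 are reserved for real-time tasks and nice values map onto 100 to 139, with a numerically smaller value meaning a higher priority. The helper names are illustrative and are not kernel API.

```c
#include <assert.h>

/* Assumed layout: 0..99 real-time, 100..139 normal (nice -20..+19). */
#define MAX_RT_PRIO 100
#define MAX_PRIO    (MAX_RT_PRIO + 40)

/* Map a nice value (-20..+19) to an internal static priority (100..139). */
int nice_to_prio(int nice)
{
    assert(nice >= -20 && nice <= 19);
    return MAX_RT_PRIO + nice + 20;
}

/* Inverse mapping, valid only for non-real-time priorities. */
int prio_to_nice(int prio)
{
    assert(prio >= MAX_RT_PRIO && prio < MAX_PRIO);
    return prio - MAX_RT_PRIO - 20;
}

/* Every real-time priority sorts ahead of every normal priority. */
int is_rt_prio(int prio)
{
    return prio < MAX_RT_PRIO;
}
```

With this mapping the default nice value of 0 lands on internal priority 120, in the middle of the normal range.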
2.4 scheduler
• Non-preemptible kernel
  – p->need_resched is set if schedule() should be invoked at the ‘next opportunity’ (on the transition from kernel mode to user mode).
• Round-robin
  – task_struct->counter: the number of clock ticks left to run in this scheduling slice, decremented by the timer.
2.4 scheduler
1. Check whether schedule() was invoked from an interrupt handler (which indicates a bug) and panic if so.
2. Use spin_lock_irq() to lock ‘runqueue_lock’.
3. Check whether a task is ‘runnable’:
  – in the TASK_RUNNING state, or
  – in the TASK_INTERRUPTIBLE state with a signal pending.
4. Examine the ‘goodness’ of each process.
5. Context switch.
2.4 scheduler – ‘goodness’
• ‘goodness’ identifies the best candidate among all processes in the runqueue list.
  – ‘goodness’ = 0: the entity has exhausted its quantum.
  – 0 < ‘goodness’ < 1000: the entity is a conventional process/thread that has not exhausted its quantum; a higher value denotes a higher level of goodness.
2.4 scheduler – ‘goodness’
    if (p->mm == prev->mm)
        return p->counter + p->priority + 1;
    else
        return p->counter + p->priority;
• A small bonus is given to the task p if it shares the address space with the previous task.
2.4 scheduler – SMP (figure: run queue)
2.4 scheduler – SMP
• The scheduler examines the processor field of each process and gives a fixed bonus (PROC_CHANGE_PENALTY, usually 15) to a process that was last executed on the current CPU (‘this_cpu’).
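The ‘goodness’ rules and the SMP bonus can be combined into one small sketch. This is a simplified model under stated assumptions, not the 2.4 source: the struct layout, the is_realtime flag and the pick_next() helper are hypothetical, and the "1000 plus real-time priority" branch is an assumption chosen to match the statement that conventional tasks stay below 1000.

```c
#include <assert.h>
#include <stddef.h>

#define PROC_CHANGE_PENALTY 15   /* SMP affinity bonus quoted on the slide */

/* Hypothetical, trimmed-down stand-in for task_struct. */
struct task {
    int counter;        /* clock ticks left in the current slice */
    int priority;       /* static priority component */
    int is_realtime;    /* non-zero for real-time tasks (assumption) */
    int rt_priority;    /* only meaningful when is_realtime is set */
    int processor;      /* CPU the task last ran on */
    const void *mm;     /* identity of the address space */
};

int goodness(const struct task *p, const struct task *prev, int this_cpu)
{
    int weight;

    if (p->is_realtime)
        return 1000 + p->rt_priority;    /* always beats conventional tasks */
    if (p->counter == 0)
        return 0;                        /* quantum exhausted */

    weight = p->counter + p->priority;
    if (p->processor == this_cpu)
        weight += PROC_CHANGE_PENALTY;   /* last ran on this CPU */
    if (p->mm == prev->mm)
        weight += 1;                     /* shares the address space */
    return weight;
}

/* Linear scan over all candidates: this is the O(n) step of the 2.4 design. */
const struct task *pick_next(const struct task *tasks, size_t n,
                             const struct task *prev, int this_cpu)
{
    const struct task *best = NULL;
    int best_weight = -1;

    for (size_t i = 0; i < n; i++) {
        int w = goodness(&tasks[i], prev, this_cpu);
        if (w > best_weight) {
            best_weight = w;
            best = &tasks[i];
        }
    }
    return best;
}
```

Because every call to pick_next() visits every runnable task, the cost of one scheduling decision grows with the number of tasks.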
2.4 scheduler – performance
• The algorithm does not scale well.
  – It is inefficient to recompute all dynamic priorities at once.
• The predefined quantum is too large for high system loads (for example, on a server).
• The strategy of boosting I/O-bound processes is not optimal.
  – It is a good strategy for ensuring a short response time for interactive programs, but…
  – some batch programs with almost no user interaction are also I/O-bound.
Recalculating Timeslices (kernel 2.4)
• Problems:
  – It can take a long time. Worse, it scales as O(n) for n tasks on the system.
  – Recalculation must occur under some sort of lock protecting the task list and the individual process descriptors. This results in high lock contention.
  – The nondeterminism of when the recalculation happens is a problem for real-time programs that need deterministic timing.
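A minimal sketch of why the recalculation is O(n): once every runnable task has used up its quantum, the scheduler walks all tasks and refills each counter. The formula below (half of the unused counter plus the static priority) is modeled on the 2.4 recalculation loop as commonly described; treat the exact expression and the struct as assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for task_struct. */
struct task {
    int counter;    /* ticks left in the current slice */
    int priority;   /* static priority, used as the base quantum */
};

/*
 * Refill every task's counter. In the real kernel this runs while the
 * run-queue lock is held, so its O(n) cost also shows up as lock contention.
 */
void recalculate_counters(struct task *tasks, size_t n)
{
    assert(tasks != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        tasks[i].counter = (tasks[i].counter >> 1) + tasks[i].priority;
}
```

A task that slept through the epoch keeps half of its unused counter, which is how this design ends up boosting I/O-bound processes.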
2.6 scheduler (figure: two run queues with task migration (put + pull) between them)
2.6 scheduler – User Preemption
• User preemption can occur
  – when returning to user-space from a system call
  – when returning to user-space from an interrupt handler
2.6 scheduler – Kernel Preemption
• The Linux kernel is a fully preemptive kernel.
  – It is possible to preempt a task at any point, so long as the kernel is in a state in which it is safe to reschedule.
  – “Safe to reschedule” means that the kernel does not hold a lock.
• The Linux design:
  – A preemption counter, preempt_count, is added to each process's thread_info.
  – This count is incremented once for each lock that is acquired and decremented once for each lock that is released.
• Kernel preemption can also occur explicitly, when a task in the kernel blocks or explicitly calls schedule().
  – In that case no additional logic is required to ensure that the kernel is in a state that is safe to preempt.
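The preempt_count rule can be sketched as a small state machine. The struct and function names below are hypothetical and only model the counting rule described above; the real kernel performs the check on unlock and on return from interrupt.

```c
#include <assert.h>

/* Hypothetical model of the two thread_info fields that matter here. */
struct thread_info_sketch {
    int preempt_count;   /* number of locks currently held */
    int need_resched;    /* set when a reschedule has been requested */
};

/* Preemption is allowed only when no lock is held and a reschedule is pending. */
int can_preempt(const struct thread_info_sketch *ti)
{
    return ti->preempt_count == 0 && ti->need_resched;
}

void sketch_lock(struct thread_info_sketch *ti)
{
    ti->preempt_count++;
}

/* Returns non-zero if releasing this lock made the task preemptible again. */
int sketch_unlock(struct thread_info_sketch *ti)
{
    assert(ti->preempt_count > 0);
    ti->preempt_count--;
    return can_preempt(ti);
}
```

Nested locks are handled naturally: only the release that brings the count back to zero re-enables preemption.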
Kernel Preemption
• Kernel preemption can occur
  – when an interrupt handler exits, before returning to kernel-space
  – when kernel code becomes preemptible again
  – if a task in the kernel explicitly calls schedule()
  – if a task in the kernel blocks (which results in a call to schedule())
O(1) & CFS scheduler
• 2.5 ~ 2.6.22: O(1) scheduler
  – Time complexity: O(1)
  – Uses a “run queue” (an active queue and an expired queue) to realize the ready queue
• 2.6.23 ~ present: Completely Fair Scheduler (CFS)
  – Time complexity: O(log n)
  – The ready queue is implemented as a red-black tree
O(1) scheduler
• Implement fully O(1) scheduling.
  – Every algorithm in the new scheduler completes in constant time, regardless of the number of running processes (since the 2.5 kernel).
• Implement perfect SMP scalability.
  – Each processor has its own locking and individual runqueue.
• Implement improved SMP affinity.
  – Attempt to group tasks to a specific CPU and continue to run them there.
  – Only migrate tasks from one CPU to another to resolve imbalances in runqueue sizes.
• Provide good interactive performance.
  – Even during considerable system load, the system should react and schedule interactive tasks immediately.
• Provide fairness.
  – No process should find itself starved of timeslice for any reasonable amount of time. Likewise, no process should receive an unfairly high amount of timeslice.
• Optimize for the common case of only one or two runnable processes, yet scale well to multiple processors, each with many processes.
The Priority Arrays
• Each runqueue contains two priority arrays (defined in kernel/sched.c as struct prio_array).
  – Active array: all tasks with timeslice left.
  – Expired array: all tasks that have exhausted their timeslice.
• Priority arrays provide O(1) scheduling.
  – Each priority array contains one queue of runnable processes per priority level.
  – The priority arrays also contain a priority bitmap used to efficiently discover the highest-priority runnable task in the system.
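A minimal sketch of such a priority array, assuming 140 priority levels and a singly linked FIFO per level. The struct and function names are hypothetical; the kernel's struct prio_array uses its own list type and sched_find_first_bit(). A portable bit scan stands in for the hardware find-first-bit instruction, and its cost is bounded by the fixed bitmap size rather than by the number of tasks.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_PRIO      140
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS  ((MAX_PRIO + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct task {
    int prio;              /* 0 = highest priority, MAX_PRIO - 1 = lowest */
    struct task *next;     /* next task at the same priority level */
};

struct prio_array_sketch {
    unsigned int nr_active;              /* tasks queued in this array */
    unsigned long bitmap[BITMAP_WORDS];  /* bit p set: queue p is non-empty */
    struct task *head[MAX_PRIO];
    struct task *tail[MAX_PRIO];
};

/* Append to the queue for t->prio and mark that level as non-empty. */
void enqueue_task(struct prio_array_sketch *a, struct task *t)
{
    assert(t->prio >= 0 && t->prio < MAX_PRIO);
    t->next = NULL;
    if (a->tail[t->prio])
        a->tail[t->prio]->next = t;
    else
        a->head[t->prio] = t;
    a->tail[t->prio] = t;
    a->bitmap[t->prio / BITS_PER_WORD] |= 1UL << (t->prio % BITS_PER_WORD);
    a->nr_active++;
}

/* Lowest set bit in the bitmap = highest-priority non-empty queue, or -1. */
int find_first_prio(const struct prio_array_sketch *a)
{
    for (size_t w = 0; w < BITMAP_WORDS; w++) {
        if (a->bitmap[w] == 0)
            continue;
        for (size_t b = 0; b < BITS_PER_WORD; b++)
            if (a->bitmap[w] & (1UL << b))
                return (int)(w * BITS_PER_WORD + b);
    }
    return -1;
}

/* Remove and return the first task of the highest-priority queue. */
struct task *dequeue_highest(struct prio_array_sketch *a)
{
    int prio = find_first_prio(a);
    if (prio < 0)
        return NULL;

    struct task *t = a->head[prio];
    a->head[prio] = t->next;
    if (a->head[prio] == NULL) {
        a->tail[prio] = NULL;
        a->bitmap[prio / BITS_PER_WORD] &= ~(1UL << (prio % BITS_PER_WORD));
    }
    t->next = NULL;
    a->nr_active--;
    return t;
}
```

Tasks at the same level come out in the order they were enqueued, which gives round-robin behaviour within one priority.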
The Linux O(1) scheduler algorithm (figure)
• Each runqueue contains two priority arrays, active and expired.
• Each of these priority arrays contains a list of tasks indexed according to priority.
(figure: runqueue with the active and expired priority queues, priorities 0-139)
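• Linux assigns higher-priority tasks longer time slices: time quantum ≈ 1/priority, where a numerically smaller priority value means a higher priority.
(figure: runqueue with active and expired arrays holding tsk 1, tsk 2 and tsk 3)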
• Linux chooses the task with the highest priority from the active array for execution.
(figure: runqueue with active and expired arrays holding tsk 1, tsk 2 and tsk 3)
• Tasks at the same priority level are scheduled round-robin.
(figure frames: runqueue with tsk 1, tsk 2 and tsk 3 in the active and expired arrays)
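• Most tasks have dynamic priorities that are based on their “nice” value (static priority) plus or minus 5.
  – dynPrio = staticPrio + bonus, with bonus in the range -5 ~ +5
  – bonus ≈ 1/sleep_time: the longer a task sleeps, the smaller its bonus and therefore the numerically lower (better) its dynamic priority.
  – Tasks that sleep a lot, typically I/O-bound ones, are treated as interactive.
(figure: runqueue with tsk 1, tsk 2 and tsk 3 in the active and expired arrays; tsk 3 is marked I/O bound)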
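The relationship dynPrio = staticPrio + bonus can be sketched as follows. The linear mapping from average sleep time to a bonus between +5 and -5 is an assumption made for illustration; the kernel's own heuristic is more involved. The function name and the max_sleep_avg parameter are hypothetical.

```c
#include <assert.h>

#define MAX_RT_PRIO 100   /* normal priorities start here */
#define MAX_PRIO    140   /* one past the lowest normal priority */

/*
 * Assumed mapping: a task that never sleeps gets bonus +5 (penalty),
 * a task that sleeps the maximum tracked time gets bonus -5 (boost).
 * The result is clamped so that it stays inside the normal range.
 */
int dynamic_prio(int static_prio, unsigned int sleep_avg,
                 unsigned int max_sleep_avg)
{
    assert(max_sleep_avg > 0 && sleep_avg <= max_sleep_avg);

    int bonus = 5 - (int)(10ULL * sleep_avg / max_sleep_avg);
    int prio = static_prio + bonus;

    if (prio < MAX_RT_PRIO)
        prio = MAX_RT_PRIO;
    if (prio > MAX_PRIO - 1)
        prio = MAX_PRIO - 1;
    return prio;
}
```

Under these assumptions a nice-0 task (static priority 120) ends up anywhere between 115 for a heavy sleeper and 125 for a CPU hog.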
• When all tasks have exhausted their time slices, the two priority arrays are exchanged.
(figure: runqueue with tsk 1, tsk 2 and tsk 3; the active and expired arrays swap roles)
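The exchange is cheap because only two pointers are swapped; no per-task work is needed at that moment. The sketch below models that step with hypothetical names and tracks only the number of tasks per array.

```c
#include <assert.h>
#include <stddef.h>

/* Only the task count matters for this sketch. */
struct prio_array_sketch {
    unsigned int nr_active;
};

struct runqueue_sketch {
    struct prio_array_sketch *active;    /* tasks with timeslice left */
    struct prio_array_sketch *expired;   /* tasks that used up their slice */
    struct prio_array_sketch arrays[2];  /* backing storage for both */
};

void runqueue_init(struct runqueue_sketch *rq)
{
    rq->arrays[0].nr_active = 0;
    rq->arrays[1].nr_active = 0;
    rq->active = &rq->arrays[0];
    rq->expired = &rq->arrays[1];
}

/* A task ran out of timeslice: account for its move to the expired array. */
void expire_task(struct runqueue_sketch *rq)
{
    assert(rq->active->nr_active > 0);
    rq->active->nr_active--;
    rq->expired->nr_active++;
}

/* Swap the two arrays once the active one is empty. Returns 1 if swapped. */
int switch_arrays_if_needed(struct runqueue_sketch *rq)
{
    if (rq->active->nr_active != 0)
        return 0;

    struct prio_array_sketch *tmp = rq->active;
    rq->active = rq->expired;
    rq->expired = tmp;
    return 1;
}
```

This pointer swap is what replaces the O(n) recalculation loop of the 2.4 design, assuming each task's new timeslice is computed when it is moved to the expired array.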
The O(1) scheduling algorithm
• sched_find_first_bit()
(figure: priority bitmap whose set bits mark the non-empty queues holding tsk 3 and tsk 2)
The O(1) scheduling algorithm
• Insert: O(1)
• Remove: O(1)
• Find first set bit: O(1)
find first set bit: O(1)
    /* Binary search for the lowest set bit; the result is undefined if word == 0. */
    static inline unsigned long __ffs(unsigned long word)
    {
        int num = 0;

    #if BITS_PER_LONG == 64
        if ((word & 0xffffffff) == 0) { num += 32; word >>= 32; }
    #endif
        if ((word & 0xffff) == 0) { num += 16; word >>= 16; }
        if ((word & 0xff) == 0) { num += 8; word >>= 8; }
        if ((word & 0xf) == 0) { num += 4; word >>= 4; }
        if ((word & 0x3) == 0) { num += 2; word >>= 2; }
        if ((word & 0x1) == 0)
            num += 1;
        return num;
    }
2.6 scheduler – CFS
• Classical schedulers compute time slices for each process in the system and allow each process to run until its time slice (quantum) is used up.
  – After that, the time slices of all processes need to be recalculated.
• CFS considers only the wait time of a process.
  – The task with the most need for CPU time is scheduled.
2.6 scheduler – CFS (motivation)
• Traditional Unix scheduling policy
(figure frames; legend: HP = high priority, LP = low priority, WT = waiting time)
2.6 scheduler – CFS (the ideal case)
(figure frames: tasks running on virtualized processors, a high-priority task advancing fast and a low-priority task advancing slowly; the frames label tasks with the speeds 8 and 3 and plot them over time)
2.6 scheduler – CFS (the implementation)
(figure frames: animation over time using the same task values 8 and 3)
2.6 scheduler – the RB-Tree
• To sort tasks on the red-black tree, the kernel uses the difference fair_clock - wait_runtime.
  – fair_clock is a measure of the CPU time a task would have gotten if scheduling were completely fair.
  – wait_runtime is a direct measure of the unfairness caused by the imperfection of real systems.
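The sort key can be illustrated without a full red-black tree. In the sketch below a linear scan stands in for "take the leftmost node", which the real tree provides in O(log n) or better; the struct and function names are hypothetical, and fair_clock is treated as a single run-queue-wide value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down scheduling entity. */
struct sched_entity_sketch {
    long long wait_runtime;   /* CPU time this task is still owed */
};

/* Key used to order tasks: a smaller key means "more in need of the CPU". */
long long entity_key(long long fair_clock, const struct sched_entity_sketch *se)
{
    return fair_clock - se->wait_runtime;
}

/*
 * Stand-in for picking the leftmost tree node: return the entity with
 * the smallest key, or NULL if there are no entities.
 */
const struct sched_entity_sketch *
pick_leftmost(long long fair_clock,
              const struct sched_entity_sketch *entities, size_t n)
{
    const struct sched_entity_sketch *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (best == NULL ||
            entity_key(fair_clock, &entities[i]) < entity_key(fair_clock, best))
            best = &entities[i];
    }
    return best;
}
```

The task that has waited longest relative to its fair share has the largest wait_runtime and therefore the smallest key, so it is the one selected.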
2.6 scheduler – issues
• Different priority levels for tasks (i.e., nice values) must be taken into account.
• Tasks must not be switched too often, because a context switch has a certain overhead.
2.6 scheduler – fields in the task_struct
• prio and normal_prio indicate the dynamic priorities; static_prio is the static priority of a process.
  – The static priority is the priority assigned to the process when it was started.
  – normal_prio denotes a priority that is computed based on the static priority and the scheduling policy of the process.
2.6 scheduler – fields in the task_struct
• The scheduler is not limited to scheduling processes; it can also work with larger entities. This allows group scheduling to be implemented.
2.6 scheduler – fields in the task_struct
• cpus_allowed is a bit field used on multiprocessor systems to restrict the CPUs on which a process may run. User space reads and changes it with:
  – sched_setaffinity()
  – sched_getaffinity()
2.6 scheduler – priority
kernel/sched.c
    static const int prio_to_weight[40] = {
     /* -20 */ 88761, 71755, 56483, 46273, 36291,
     /* -15 */ 29154, 23254, 18705, 14949, 11916,
     /* -10 */  9548,  7620,  6100,  4904,  3906,
     /*  -5 */  3121,  2501,  1991,  1586,  1277,
     /*   0 */  1024,   820,   655,   526,   423,
     /*   5 */   335,   272,   215,   172,   137,
     /*  10 */   110,    87,    70,    56,    45,
     /*  15 */    36,    29,    23,    18,    15,
    };
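A short worked example of how these weights translate into CPU shares, assuming that each runnable task receives CPU time in proportion to its weight divided by the sum of all runnable weights. The helper functions are hypothetical; only the table values come from the slide.

```c
#include <assert.h>

/* Weight table from the slide, indexed by nice + 20. */
static const int prio_to_weight[40] = {
 /* -20 */ 88761, 71755, 56483, 46273, 36291,
 /* -15 */ 29154, 23254, 18705, 14949, 11916,
 /* -10 */  9548,  7620,  6100,  4904,  3906,
 /*  -5 */  3121,  2501,  1991,  1586,  1277,
 /*   0 */  1024,   820,   655,   526,   423,
 /*   5 */   335,   272,   215,   172,   137,
 /*  10 */   110,    87,    70,    56,    45,
 /*  15 */    36,    29,    23,    18,    15,
};

int nice_to_weight(int nice)
{
    assert(nice >= -20 && nice <= 19);
    return prio_to_weight[nice + 20];
}

/*
 * CPU share, in parts per thousand (rounded down), of a task with nice
 * value nice_a when it competes with exactly one task of nice value nice_b.
 */
int cpu_share_permille(int nice_a, int nice_b)
{
    long wa = nice_to_weight(nice_a);
    long wb = nice_to_weight(nice_b);
    return (int)(1000 * wa / (wa + wb));
}
```

Under this assumption a nice-0 task competing with a nice-1 task gets about 55% of the CPU (1024 / (1024 + 820)), and against a nice-5 task about 75% (1024 / (1024 + 335)).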
Summary
• The concept of OS schedulers
• Maximize throughput.
  – This is what system administrators care about.
  – How to maximize throughput (CPU & I/O).
• What is the major drawback of the Linux 2.4 scheduler?
• Pros and cons of the Linux 2.6 schedulers
  – O(1)
  – CFS