Outline OS schedulers Unix scheduling Linux 2 4
- Slides: 81
Outline • • • OS schedulers Unix scheduling Linux 2. 4 scheduler Linux 2. 6 scheduler – O(1) scheduler – O(2) scheduler 2
Introduction preemptive & cooperative multitasking • A multitasking operating system is one that can simultaneously interleave execution of more than one process. • Multitasking operating systems come in two flavors: cooperative multitasking and preemptive multitasking. – Linux provides preemptive multitasking – MAC OS 9 and earlier being the most notable cooperative multitasking. 3
Linux scheduler – Scheduling Policy • Scheduling policy determines what runs when – fast process response time (low latency) – maximal system utilization (high throughput) • Processes classification: – I/O-bound processes: spends much of its time submitting and waiting on I/O requests – Processor-bound processes: spend much of their time executing code • Unix variants tends to favor I/O-bound processes, thus providing good process response time 4
Linux scheduler – Process Priority • Linux’s priority-based scheduling – Rank processes based on their worth and need for processor time. – Both the user and the system may set a process's priority to influence the scheduling behavior of the system. • Dynamic priority-based scheduling – Begins with an initial base priority – Then enables the scheduler to increase or decrease the priority dynamically to fulfill scheduling objectives. – E. g. , a process that is spending more time waiting on I/O will receive an elevated dynamic priority. 5
Linux scheduler – Priority Ranges • Two separate priority ranges. – nice value, from -20 to +19 with a default of 0. • Larger nice values correspond to a lower priority. (you are being nice to the other processes on the system). – real-time priority, by default range from 0 to 99. • All real-time processes are at a higher priority than normal processes. • Linux implements real-time priorities in accordance with POSIX standards on the matter. 6
2. 4 scheduler • Non-preemptible kernel – Set p->need_resched if schedule() should be invoked at the ‘next opportunity‘ (kernel => user mode). • Round-robin – task_struct->counter: number of clock ticks left to run in this scheduling slice, decremented by a timer. 7
2. 4 scheduler 1. Check if schedule() was invoked from interrupt handler (due to a bug) and panic if so. 2. Use spin_lock_irq() to lock ‘runqueue_lock’ 3. Check if a task is ‘runnable’ – in TASK_RUNNING state – in TASK_INTERRUPTIBLE state and a signal is pending 4. Examine the ‘goodness’ of each process 5. Context switch 8
2. 4 scheduler – ‘goodness’ • ‘goodness’: identifying the best candidate among all processes in the runqueue list. – ‘goodness’ = 0: the entity has exhausted its quantum. – 0 < ‘goodness’ < 1000: the entity is a conventional process/thread that has not exhausted its quantum; a higher value denotes a higher level of goodness. 9
2. 4 scheduler – ‘goodness’ if (p->mm == prev->mm) return p->counter + p->priority + 1; else return p->counter + p->priority; • A small bonus is given to the task p if it shares the address space with the previous task. 10
2. 4 scheduler - SMP run queue 11
2. 4 scheduler - SMP Examine the processor field of the processes and gives a consistent bonus (that is PROC_CHANGE_PENALTY, usually 15) to the process that was last executed on the ‘this_cpu’ CPU. 12
2. 4 scheduler - performance • The algorithm does not scale well – It is inefficient to re-compute all dynamic priorities at once. • The predefined quantum is too large for high system loads (for example: a server) • I/O-bound process boosting strategy is not optimal – a good strategy to ensure a short response time for interactive programs, but… – some batch programs with almost no user interaction are I/O-bound. 13
Recalculating Timeslices (kernel 2. 4) • Problems: – Can take a long time. Worse, it scales O(n) for n tasks on the system. – Recalculation must occur under some sort of lock protecting the task list and the individual process descriptors. This results in high lock contention. – Nondeterminism is a problem with deterministic realtime programs. 14
Processes classification • Definition: – I/O-bound processes: spends much of its time submitting and waiting on I/O requests – Processor-bound processes: spend much of their time executing code • Linux tends to favor I/O-bound processes, thus providing good process response time • How to classify processes? 15
16
0 tq≠ 0 tq=? ? ? ity ior pr ≠ um nt nd ua ou e q /O b I Tim Hig h Tim (C e PU qu bo ant un um d) =0 ks tas tq=0 17
Scheduling policy time_quantumnew = time_quantumold/2 + time_quantum_table[static_priority] & dynamic_priority ≈ time_quantumnew 18
≠ um nt nd ua ou e q /O b I Tim (C e PU qu bo ant un um d) =0 tq=0 tq=? 0 tq≠ 0 tq=? ? ? 19
≠ um nt nd ua ou e q /O b I Tim (C e PU qu bo ant un um d) =0 tq=0 tq=? 0 tq≠ 0 tq=? ? ? 20
≠ um nt nd ua ou e q /O b I Tim (C e PU qu bo ant un um d) =0 tq≠ 0 tq=0 tq=? 0 tq≠ 0 tq=? ? ? 21
2. 6 scheduler run queue task migration (put + pull) run queue 22
2. 6 scheduler – User Preemption • User preemption can occur – When returning to user-space from a system call – When returning to user-space from an interrupt handler 23
2. 6 scheduler – Kernel Preemption • The Linux kernel is a fully preemptive kernel. – It is possible to preempt a task at any point, so long as the kernel is in a state in which it is safe to reschedule. – “safe to reschedule”: kernel does not hold a lock • The Linux design: – additing of a preemption counter, preempt_count, to each process's thread_info – This count increments once for each lock that is acquired and decrements once for each lock that is released • Kernel preemption can also occur explicitly, when a task in the kernel blocks or explicitly calls schedule(). – no additional logic is required to ensure that the kernel is in a state that is safe to preempt! 24
Kernel Preemption • Kernel preemption can occur – When an interrupt handler exits, before returning to kernel-space – When kernel code becomes preemptible again – If a task in the kernel explicitly calls schedule() – If a task in the kernel blocks (which results in a call to schedule()) 25
O(1) & CFS scheduler • 2. 5 ~ 2. 6. 22: O(1) scheduler – Time complexity: O(1) – Using “run queue” (an active Q and an expired Q) to realize the ready queue • 2. 6. 23~present: Completely Fair Scheduler (CFS) – Time complexity: O(log n) – the ready queue is implemented as a red-black tree 26
O(1) scheduler • Implement fully O(1) scheduling. – Every algorithm in the new scheduler completes in constant-time, regardless of the number of running processes. (Since the 2. 5 kernel). • Implement perfect SMP scalability. – Each processor has its own locking and individual runqueue. • Implement improved SMP affinity. – Attempt to group tasks to a specific CPU and continue to run them there. – Only migrate tasks from one CPU to another to resolve imbalances in runqueue sizes. • Provide good interactive performance. – Even during considerable system load, the system should react and schedule interactive tasks immediately. • Provide fairness. – No process should find itself starved of timeslice for any reasonable amount of time. Likewise, no process should receive an unfairly high amount of timeslice. • Optimize for the common case of only one or two runnable processes, yet scale well to multiple processors, each with many processes. 27
The Priority Arrays • Each runqueuecontains two priority arrays (defined in kernel/sched. c as struct prio_array) – Active array: all tasks with timeslice left. – Expired array: all tasks that have exhausted their timeslice. • Priority arrays provide O(1) scheduling. – Each priority array contains one queue of runnable processors per priority level. – The priority arrays also contain a priority bitmap used to efficiently discover the highest-priority runnable task in the system. 28
The Linux O(1) scheduler algorithm 29
The Priority Arrays • Each runqueuecontains two priority arrays (defined in kernel/sched. cas struct prio_array) – Active array: all tasks with timesliceleft. – Expired array: all tasks that have exhausted their timeslice. • Priority arrays provide O(1) scheduling. – Each priority array contains one queue of runnable processors per priority level. – The priority arrays also contain a priority bitmap used to efficiently discover the highest-priority runnable task in the system. 30
Each runqueue contains two priority arrays – active and expired. � Each of these priority arrays contains a list of tasks indexed according to priority � runqueue Priority queue (0 -139) expired active 31
� runqueue Linux assigns higher-priority tasks longer time-slice Time quantum ≈ 1/priority tsk 1 tsk 2 tsk 3 active expired 32
� runqueue Linux chooses the task with the highest priority from the active array for execution. tsk 1 tsk 2 tsk 3 active expired 33
runqueue tsk 1 Round-robin tsk 2 tsk 3 active expired 34
runqueue tsk 1 Round-robin tsk 3 tsk 2 active expired 35
runqueue tsk 1 tsk 2 tsk 3 active expired 36
Most tasks have dynamic priorities that are based on their “nice” value (static priority) plus or minus 5 � Interactivity of a task ≈ 1/sleep_time � runqueue dyn. Prio = static. Prio + bonus = -5 ~ +5 bonus ≈ 1/sleep_time tsk 1 tsk 3 tsk 2 tsk 3 I/O bound active expired 37
� runqueue When all tasks have exhausted their time slices, the two priority arrays are exchanged! tsk 1 tsk 3 tsk 2 active expired 38
The O(1) scheduling algorithm sched_find_first_bit() 1 1 1 tsk 3 tsk 2 39
The O(1) scheduling algorithm Insert O(1) 1 1 1 Remove O(1) find first set bit O(1) 40
find first set bit O(1) word >>= 8; static inline unsigned long __ffs (unsigned long word) { int num = 0; #if BITS_PER_LONG == 64 if ((word & 0 xffff) == 0) { num += 32; word >>= 32; } #endif if ((word & 0 xffff) == 0) { num += 16; word >>= 16; } if ((word & 0 xff) == 0) { num += 8; } } if ((word & 0 xf) == 0) { num += 4; word >>= 4; } if ((word & 0 x 3) == 0) { num += 2; word >>= 2; } if ((word & 0 x 1) == 0) num += 1; return num; 41
2. 6 scheduler – CFS • Classical schedulers compute time slices for each process in the system and allow them to run until their time slice/quantum is used up. – After that, all process need to be recalculated. • CFS considers only the wait time of a process – The task with the most need for CPU time is scheduled. 42
2. 6 scheduler – CFS • 43
2. 6 scheduler – CFS (motivation) • Traditional Linux scheduling policy HP LP HP: high priority LP: low priority 44
2. 6 scheduler – CFS (motivation) • Traditional Unix scheduling policy WT WT WT: waiting time 45
2. 6 scheduler – CFS (the idea case) task task virtualization High priority Low priority 46
2. 6 scheduler – CFS (the idea case) task task virtualization fastly slowly 47
2. 6 scheduler – CFS (the idea case) task 8 : task 3 : task 3 speed 48
time 2. 6 scheduler – CFS (the idea case) 8 8 8 3 3 3 49
time 2. 6 scheduler – CFS (the idea case) 8 8 8 3 3 3 50
time 2. 6 scheduler – CFS (the idea case) 8 8 8 3 3 3 51
time 2. 6 scheduler – CFS (the idea case) 8 8 8 3 3 3 52
time 2. 6 scheduler – CFS (the idea case) 8 8 8 3 3 3 53
time 2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 54
time 2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 55
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 56
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 57
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 58
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 59
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 60
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 61
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 62
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 63
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 64
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 65
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 66
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 67
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 68
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 69
2. 6 scheduler – CFS (the implementation) 8 8 8 3 3 3 70
2. 6 scheduler – CFS 71
2. 6 scheduler – the RB-Tree • To sort tasks on the red-black tree, the kernel uses the difference fair_clock -wait_runtime. – While fair_clock is a measure for the CPU time a task would have gotten if scheduling were completely fair, – wait_runtime is a direct measure for the unfairness caused by the imperfection of real systems. 72
2. 6 scheduler – issues • Different priority levels for tasks (i. e. , nice values) must be taken into account • Tasks must not be switched too often because a context switch has a certain overhead. 73
2. 6 scheduler – fields in the task_struct 74
2. 6 scheduler – fields in the task_struct • prio and normal_prio indicate the dynamic priorities, static_prio the static priority of a process. – The static priority is the priority assigned to the process when it was started. – normal_priority denotes a priority that is computed based on the static priority and the scheduling policy of the process. 75
2. 6 scheduler – fields in the task_struct • The scheduler is not limited to schedule processes, but can also work with larger entities. This allows for implementing group scheduling. 76
2. 6 scheduler – fields in the task_struct • cpus_allowed is a bit field used on multiprocessor systems to restrict the CPUs on which a process may run. – setaffinity() – getaffinity() 77
2. 6 scheduler – priority 78
2. 6 scheduler – priority kernel/sched. c static const int prio_to_weight[40] = { /* -20 */ 88761, 71755, 56483, 46273, 36291, /* -15 */ 29154, 23254, 18705, 14949, 11916, /* -10 */ 9548, 7620, 6100, 4904, 3906, /* -5 */ 3121, 2501, 1991, 1586, 1277, /* 0 */ 1024, 820, 655, 526, 423, /* 5 */ 335, 272, 215, 172, 137, /* 10 */ 110, 87, 70, 56, 45, /* 15 */ 36, 29, 23, 18, 15, }; 79
2. 6 scheduler – priority 80
Summary • The concept of OS schedulers • Maximize throughput. – This is what system administrators care about. – How to maximize throughput (CPU & I/O). • What is the major drawback of Linux 2. 4 scheduler • Pros and cons of Linux 2. 6 schedulers – O(1) – CFS 81
- Processor
- Traditional unix scheduling
- Tetri s
- Open source job scheduler software for windows
- Perbedaan unix dan linux
- Linux distribution timeline
- Is unix and linux same
- Difference between unix and linux
- What is unix and linux difference
- Job scheduling vs process scheduling
- Linux scheduling policy
- Linux scheduling classes
- Uclinux
- Linux security modules
- Linux outline
- Quote one topic sentence
- Unix password
- Unix sistema operacional
- Unix web
- Sco unix operating system
- Unix
- Thinkfree office viewer
- Unix basic concepts
- Cronologia de los sistemas operativos
- Unix process management
- Unix elf
- Racf tutorial
- Unix architecture
- Why unix was developed
- Unix multiple choice questions
- Unix commands ppt
- Bsd berkeley
- Windows research kernel
- Http unix socket
- Unix network monitoring
- Lmd unix bank
- Tampilan sistem operasi
- Unix adalah
- Namei algorithm
- Posix basics
- Pada unix, file dibagi menjadi 3 kategori utama
- Ipc unix
- Unix device drivers
- Unix ware
- Unix and network programming
- The shell interpretive cycle in unix
- History of unix os
- Getting started with unix
- Unix adalah
- Unix process state transition diagram
- Iget algorithm in unix
- Apa sistem file dari sistem operasi unix
- Unix file apis
- Internal command in unix
- Awk unix
- So unix
- Unix valcea
- History of unix
- Fifo file api in unix
- Job control in unix
- Bamuengine
- Unix
- Unix
- Design principles of unix operating system
- Popoarele orientului antic
- Mosharaf chowdhury
- Unix makefiles
- Ece344
- Iget algorithm in unix
- Process vs program
- Unix folder permissions
- Unix s
- Define operating system
- Sigpending example
- Unix concurrency mechanisms
- Pts unix
- Command line arguments in unix
- Unix cal command exploit
- Message queue in unix
- Beberapa sifat dan keistimewaan unix kecuali
- Hardware programmer
- Unix 101