Kernel Synchronization Examples From the Linux Kernel
Michael E. Locasto

BIG PICTURE: HOW CAN THE KERNEL CORRECTLY SERVICE REQUESTS?
Kernel control flow is a complicated, asynchronous interleaving.

Main Ideas / Concepts
  • Atomic operations in x86
  • Kernel locking / synchronization primitives
  • Kernel preemption
  • Read-Copy-Update
  • The “big kernel lock”

Kernel Preemption
  • Kernel preemption means that the kernel can preempt other running kernel control paths, whether they run on behalf of a user process or a kernel thread.
  • Acquiring a spinlock automatically disables kernel preemption (as we will see in the code).

Synchronization Primitives
  • Atomic operations
  • Disable interrupts (cli/sti modify IF of eflags)
  • Lock memory bus (x86 lock prefix)
  • Spin locks
  • Semaphores
  • Sequence locks
  • Read-copy-update (RCU) (lock free)

Barriers are serializing operations; they “gather” and make operations sequential.
On x86, the following act as memory barriers:
  • in/out on I/O ports
  • the lock prefix
  • writes to control registers, system registers/eflags, and debug registers
  x86 instr   meaning
  lfence      read barrier
  sfence      write barrier
  mfence      r/w barrier
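The ordering guarantee that barriers provide can be sketched at user level. The example below is only an analogue, not kernel code: it uses C11 fences (`atomic_thread_fence`) where kernel code would use its own barrier macros such as `smp_wmb()` and `smp_rmb()`. The names `publish`, `consume`, and `barrier_demo` are hypothetical. The writer stores data and then raises a flag; the fences ensure a reader that sees the flag also sees the data.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;        /* ordinary data published by the writer */
static atomic_int ready;   /* flag the reader polls */

static void *publish(void *arg)
{
    (void)arg;
    payload = 42;                                /* 1. write the data */
    atomic_thread_fence(memory_order_release);   /* 2. data write ordered before flag write */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);   /* 3. raise the flag */
    return NULL;
}

static void *consume(void *arg)
{
    int *out = arg;

    while (!atomic_load_explicit(&ready, memory_order_relaxed))
        ;                                        /* spin until the flag is raised */
    atomic_thread_fence(memory_order_acquire);   /* flag read ordered before data read */
    *out = payload;
    return NULL;
}

/* Runs one writer and one reader; returns the value the reader observed. */
int barrier_demo(void)
{
    pthread_t w, r;
    int seen = 0, rc;

    payload = 0;
    atomic_store(&ready, 0);
    rc = pthread_create(&r, NULL, consume, &seen);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, publish, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

Calling `barrier_demo()` from `main` (built with `-pthread`) returns the value the reader saw. Without the two fences, the language makes no promise that the reader observes the data write.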

Barrier Implementation

Motivating Example: Using Semaphores in the Kernel
What are down_read, up_read, and mmap_sem?

START WITH THE DATA STRUCTURE: MM->MMAP_SEM
Let’s start with the data structure and see where that leads…

current->mm->mmap_sem
struct mm_struct: include/linux/mm_types.h

PRIMITIVE ONE: ATOMIC TYPE AND OPERATIONS

On x86, these operations are atomic:
  • simple asm instructions that involve 0 or 1 aligned memory access
  • read-modify-write instructions (e.g., inc, dec), as long as no other CPU takes the memory bus between the read and the write
  • anything prefixed by the IA-32 ‘lock’ prefix
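To see why atomic read-modify-write matters, here is a minimal user-level sketch using C11 `<stdatomic.h>` rather than the kernel's `atomic_t` API. The names `bump` and `atomic_counter_demo` and the thread and iteration counts are hypothetical. Several threads increment one shared counter, and because each increment is atomic, no update is lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NTHREADS 4
#define NITERS   100000

static atomic_long counter;

static void *bump(void *arg)
{
    (void)arg;
    for (int i = 0; i < NITERS; i++)
        atomic_fetch_add(&counter, 1);   /* on x86, typically a lock-prefixed add */
    return NULL;
}

/* Starts NTHREADS incrementing threads and returns the final count. */
long atomic_counter_demo(void)
{
    pthread_t t[NTHREADS];

    atomic_store(&counter, 0);
    for (int i = 0; i < NTHREADS; i++) {
        int rc = pthread_create(&t[i], NULL, bump, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return atomic_load(&counter);
}
```

The result is always `NTHREADS * NITERS`. Replacing the counter with a plain `long` and `counter++` would make the program racy, and increments could be lost on a multiprocessor.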

atomic_t: include/linux/types.h

Example: Reference Counters
Refcounts are atomic_t values associated with a resource; each one counts the kernel control paths currently accessing that resource.
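The reference-counting pattern can be sketched at user level as follows. This is an illustration under assumptions, not the kernel's code: `struct resource`, `resource_new`, `resource_get`, and `resource_put` are hypothetical names, and C11 atomics stand in for `atomic_t` with `atomic_inc()` and `atomic_dec_and_test()`. The path that drops the count to zero is the one that frees the object.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct resource {              /* hypothetical refcounted object */
    atomic_int refcount;       /* number of paths currently using it */
    int data;
};

struct resource *resource_new(int data)
{
    struct resource *r = malloc(sizeof *r);

    if (r == NULL)
        return NULL;
    atomic_init(&r->refcount, 1);   /* the creator holds the first reference */
    r->data = data;
    return r;
}

void resource_get(struct resource *r)
{
    atomic_fetch_add(&r->refcount, 1);
}

/* Drops one reference; returns 1 if this call released the object. */
int resource_put(struct resource *r)
{
    if (atomic_fetch_sub(&r->refcount, 1) == 1) {   /* previous value 1: we were last */
        free(r);
        return 1;
    }
    return 0;
}

/* Takes a second reference, then drops both; encodes the two put results. */
int refcount_demo(void)
{
    struct resource *r = resource_new(7);
    int first, second;

    assert(r != NULL);
    resource_get(r);             /* a second control path starts using r */
    first = resource_put(r);     /* one user remains, so nothing is freed */
    second = resource_put(r);    /* last user gone, object is freed */
    return first * 10 + second;
}
```

`refcount_demo()` returns 1: the first put leaves the object alive and the second releases it.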

PRIMITIVE TWO: SPINLOCKS

include/linux/spinlock_types.h
typedef struct raw_spinlock {
        arch_spinlock_t raw_lock;
} raw_spinlock_t;
typedef struct spinlock {
        struct raw_spinlock rlock;
} spinlock_t;

arch/x86/include/asm/spinlock_types.h#L10
slock=1 (unlocked), slock=0 (locked)
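A busy-waiting lock can be sketched at user level with a C11 `atomic_flag`. This is a toy test-and-set lock, not the kernel's x86 implementation. The names `toy_spinlock_t`, `toy_spin_lock`, and `toy_spin_unlock` are hypothetical, and the flag's sense is the reverse of the `slock` convention above (a set flag means locked). Unlike the kernel's `spin_lock()`, it cannot disable preemption.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define SPIN_THREADS 4
#define SPIN_ITERS   50000

typedef struct { atomic_flag held; } toy_spinlock_t;   /* set = locked */

static toy_spinlock_t toy_lock = { ATOMIC_FLAG_INIT };
static long shared_total;                              /* protected by toy_lock */

static void toy_spin_lock(toy_spinlock_t *l)
{
    /* Atomically set the flag; keep retrying while it was already set. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;   /* busy-wait */
}

static void toy_spin_unlock(toy_spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static void *add_many(void *arg)
{
    (void)arg;
    for (int i = 0; i < SPIN_ITERS; i++) {
        toy_spin_lock(&toy_lock);
        shared_total++;            /* critical section: plain, non-atomic update */
        toy_spin_unlock(&toy_lock);
    }
    return NULL;
}

/* Runs SPIN_THREADS workers and returns the final protected total. */
long spinlock_demo(void)
{
    pthread_t t[SPIN_THREADS];

    shared_total = 0;
    for (int i = 0; i < SPIN_THREADS; i++) {
        int rc = pthread_create(&t[i], NULL, add_many, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < SPIN_THREADS; i++)
        pthread_join(t[i], NULL);
    return shared_total;
}
```

Because the lock serializes the critical section, the total always equals `SPIN_THREADS * SPIN_ITERS`.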

spinlock API (partial)
include/linux/spinlock.h
kernel/spinlock.c

include/linux/spinlock_api_smp.h

Linux Tracks Lock Dependencies @ Runtime

PRIMITIVE THREE: SEMAPHORES
Here we mainly consider read/write semaphores.

Important Caveats about Kernel Semaphores
  • Semaphores are *not* like spinlocks: the invoking process is put to sleep rather than busy-waiting.
  • As a result, kernel semaphores should only be used by functions that can safely sleep (i.e., not interrupt handlers).

might_sleep() leads (eventually) to:

rwsem_wake

__rwsem_do_wake
On our way out: if a writer is at the front of the waiting queue, allow that single writer to proceed. Otherwise, wake all of the readers at the front of the queue, so that any number of them can access the critical region.

Advanced Techniques
Sequence Locks
  • A solution to the multiple readers-writer problem in that a writer is permitted to advance even if readers are in the critical section.
  • Readers must check both an entry and exit flag to see if data has been modified underneath them.
Read-Copy-Update (RCU)
  • Designed to protect data structures accessed by multiple CPUs; allows many readers and writers.
  • Basic idea is simple (and in the name). Readers access the data structure via a pointer; writers initially act as readers and create a copy to modify. “Writing” is just a matter of updating the pointer.
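The sequence-lock idea described above (readers check a counter on entry and on exit, and retry if it changed) can be sketched with C11 atomics. This is a user-level illustration with a single writer, not the kernel's `seqlock_t`; the names `struct seq_pair`, `pair_write`, `pair_read`, and `seqlock_demo` are hypothetical. The writer keeps the invariant `y == 2 * x`, and the reader counts how often it sees that invariant broken.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define SEQ_WRITES 200000
#define SEQ_READS  200000

struct seq_pair {
    atomic_uint seq;    /* even: stable, odd: write in progress */
    atomic_int x, y;    /* writer maintains y == 2 * x */
};

static struct seq_pair pair;   /* static storage: starts as seq = 0, x = y = 0 */

static void pair_write(struct seq_pair *p, int x)   /* single writer assumed */
{
    unsigned s = atomic_load_explicit(&p->seq, memory_order_relaxed);

    atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);   /* entry: make seq odd */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->x, x, memory_order_relaxed);
    atomic_store_explicit(&p->y, 2 * x, memory_order_relaxed);
    atomic_store_explicit(&p->seq, s + 2, memory_order_release);   /* exit: make seq even */
}

static void pair_read(struct seq_pair *p, int *x, int *y)
{
    unsigned before, after;

    do {
        before = atomic_load_explicit(&p->seq, memory_order_acquire);
        *x = atomic_load_explicit(&p->x, memory_order_relaxed);
        *y = atomic_load_explicit(&p->y, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&p->seq, memory_order_relaxed);
    } while ((before & 1u) || before != after);   /* retry if a write overlapped */
}

static void *seq_writer(void *arg)
{
    (void)arg;
    for (int i = 1; i <= SEQ_WRITES; i++)
        pair_write(&pair, i);
    return NULL;
}

static void *seq_reader(void *arg)
{
    long *torn = arg;
    int x, y;

    for (int i = 0; i < SEQ_READS; i++) {
        pair_read(&pair, &x, &y);
        if (y != 2 * x)
            (*torn)++;   /* an inconsistent snapshot slipped through */
    }
    return NULL;
}

/* Runs one writer against one reader; returns the number of torn reads. */
long seqlock_demo(void)
{
    pthread_t w, r;
    long torn = 0;
    int rc;

    rc = pthread_create(&r, NULL, seq_reader, &torn);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, seq_writer, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return torn;
}
```

The writer never waits for readers, which is the property the slide highlights. The cost falls on readers, which may have to repeat their read.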

RCU
  • Only for kernel control paths; disables preemption.
  • Used to protect data structures accessed through a pointer: by adding a layer of indirection, we can reduce wholesale writes/updates to a single atomic write/update.
  • Heavy restrictions:
      • RCU tasks cannot sleep
      • readers do little work
      • writers act as readers, make a copy, then update the copy; finally, they rewrite the pointer
      • cleanup is correspondingly complicated
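The read, copy, and update steps can be sketched at user level with an atomic pointer. This shows only the pointer-swap idea and is not the kernel's RCU: `struct config`, `cfg_read`, `cfg_update_b`, and `rcu_demo` are hypothetical names, a single updater is assumed, and the hard part (waiting until all readers are done before freeing, which the kernel handles with `synchronize_rcu()`) is sidestepped by running everything in one thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct config { int a; int b; };                /* hypothetical protected data */

static _Atomic(struct config *) current_cfg;    /* readers reach the data via this pointer */

/* Reader side: fetch the current version; it is never modified in place. */
static struct config *cfg_read(void)
{
    return atomic_load_explicit(&current_cfg, memory_order_acquire);
}

/* Updater side (single updater assumed): copy, modify the copy, publish it.
 * Returns the old version, which the caller may free only once no reader
 * can still be using it. */
static struct config *cfg_update_b(int b)
{
    struct config *old = cfg_read();            /* writers start out as readers */
    struct config *copy = malloc(sizeof *copy);

    assert(copy != NULL);
    *copy = *old;                               /* copy */
    copy->b = b;                                /* update the private copy */
    return atomic_exchange_explicit(&current_cfg, copy,
                                    memory_order_acq_rel);   /* single atomic pointer write */
}

int rcu_demo(void)
{
    struct config *first = malloc(sizeof *first);
    struct config *snap, *old, *now;
    int result;

    assert(first != NULL);
    first->a = 1;
    first->b = 2;
    atomic_store_explicit(&current_cfg, first, memory_order_release);

    snap = cfg_read();          /* a reader enters before the update */
    old = cfg_update_b(20);     /* the updater publishes a new version */
    now = cfg_read();           /* a later reader sees the new version */

    assert(snap == old && now->a == 1);
    result = snap->b * 100 + now->b;   /* old reader still sees b = 2, new reader b = 20 */

    free(old);                  /* safe here only because no reader remains */
    atomic_store_explicit(&current_cfg, NULL, memory_order_release);
    free(now);
    return result;
}
```

`rcu_demo()` returns 220: the reader that started before the update keeps a consistent old view, while the later reader sees the new one. Deciding when `free(old)` is safe with concurrent readers is the "correspondingly complicated" cleanup.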

RCU EXAMPLE: GETPPID(2)
http://lxr.linux.no/#linux+v2.6.35.14/kernel/timer.c#L1354

EXERCISE: TIME PERFORMANCE COST OF SYNCHRONIZATION
Does synchronization impose a significant cost? (test at user level)
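One possible starting point for this exercise is sketched below. It is an assumption about how the test might be set up, not the exercise's reference solution: the names `struct sync_cost` and `measure_sync_cost` are hypothetical, and it measures only the uncontended case in a single thread. It times the same number of increments done three ways: plain, atomic, and under a mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

#define COST_ITERS 1000000L

struct sync_cost {
    double plain_s, atomic_s, mutex_s;   /* CPU seconds spent in each loop */
    long total;                          /* sum of the three counters */
};

static double elapsed(clock_t start, clock_t end)
{
    return (double)(end - start) / CLOCKS_PER_SEC;
}

struct sync_cost measure_sync_cost(void)
{
    struct sync_cost c;
    volatile long plain = 0;    /* volatile keeps the loop from being optimized away */
    atomic_long atomic_ctr;
    long guarded = 0;
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    clock_t t0, t1;

    atomic_init(&atomic_ctr, 0);

    t0 = clock();
    for (long i = 0; i < COST_ITERS; i++)
        plain++;                               /* no synchronization */
    t1 = clock();
    c.plain_s = elapsed(t0, t1);

    t0 = clock();
    for (long i = 0; i < COST_ITERS; i++)
        atomic_fetch_add(&atomic_ctr, 1);      /* atomic read-modify-write */
    t1 = clock();
    c.atomic_s = elapsed(t0, t1);

    t0 = clock();
    for (long i = 0; i < COST_ITERS; i++) {
        pthread_mutex_lock(&m);                /* uncontended lock and unlock */
        guarded++;
        pthread_mutex_unlock(&m);
    }
    t1 = clock();
    c.mutex_s = elapsed(t0, t1);

    c.total = plain + atomic_load(&atomic_ctr) + guarded;
    assert(c.total == 3 * COST_ITERS);
    return c;
}
```

Printing the three fields of the returned struct gives a first comparison; the numbers depend on the machine and compiler flags. A natural next step is to repeat the measurement with several threads contending for the same counter.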

CODE: AUTOMATICALLY DRAWING RESOURCE GRAPHS