LAWS OF ORDER: EXPENSIVE SYNCHRONIZATION IN CONCURRENT ALGORITHMS CANNOT BE ELIMINATED
POPL '11
Hagit Attiya, Rachid Guerraoui, Danny Hendler, Petr Kuznetsov, Maged M. Michael, Martin Vechev
Presenter: Michael Gorelik

CONTENTS
  • Motivation
  • RAW & AWAR patterns
  • Sequential consistency
  • Relaxed memory models
  • Mutual exclusion: examples
  • Linearizability: examples
  • Relaxed semantics: examples

MOTIVATION
Building correct and efficient concurrent algorithms is known to be difficult. To achieve efficiency, designers spend significant time trying to remove unnecessary and costly synchronization.

MOTIVATION: RAW & AWAR PATTERNS
Two common synchronization patterns that frequently arise in the design of concurrent algorithms are:
  • read after write (RAW)
  • atomic write after read (AWAR)

MOTIVATION: EXPENSIVE RMW OPERATIONS
We will see that many expensive synchronization operations, such as locks, CAS, and fences, use the RAW or AWAR pattern. These operations are much slower than regular reads and writes (sometimes 50 times slower).

MOTIVATION: MUTUAL EXCLUSION AND LINEARIZABILITY
If we are to build a mutual exclusion algorithm or a linearizable algorithm, then in certain sequential executions of that algorithm we must use either RAW or AWAR. An algorithm whose executions never use RAW or AWAR is therefore incorrect.

MOTIVATION: AVOIDING EXPENSIVE SYNCHRONIZATION
When can we avoid the RAW and AWAR patterns? When do we have no choice other than to use them?

RAW AND AWAR PATTERNS: RAW (READ AFTER WRITE)
The RAW pattern consists of a process writing to some shared variable A, followed by the same process reading a different shared variable B, without that process writing to B in between.
Timeline: write to A, then read from B.
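
The RAW pattern can be sketched in C11 as follows. This is a minimal illustration, not code from the talk: the variables `A` and `B` mirror the slide, and the function name `raw_write_then_read` is hypothetical. The sequentially consistent fence is one common way to make the write to A visible before B is read.

```c
#include <stdatomic.h>

/* Hypothetical shared variables, named after the slide's A and B.
 * Objects with static storage duration start out as zero. */
static atomic_int A;
static atomic_int B;

/* RAW: write A, then read a different variable B.
 * The seq_cst fence keeps the read of B from being ordered
 * before the write to A. */
int raw_write_then_read(int value)
{
    atomic_store_explicit(&A, value, memory_order_relaxed);  /* write A */
    atomic_thread_fence(memory_order_seq_cst);               /* enforce RAW order */
    return atomic_load_explicit(&B, memory_order_relaxed);   /* read B */
}
```

Without the fence, a relaxed memory model may let the read of B take effect before the write to A is visible to other threads.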

RAW AND AWAR PATTERNS: AWAR (ATOMIC WRITE AFTER READ)
The AWAR pattern consists of a process reading some shared variable, followed by the same process writing to the same shared variable, where the entire read-write sequence is atomic (a read-modify-write, RMW).
Timeline: atomically, read from A, then write to A.
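
Two familiar instances of AWAR, sketched with C11 atomics. The function names `awar_cas` and `awar_increment` are hypothetical and chosen only for illustration; the variable `A` mirrors the slide.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int A;   /* hypothetical shared variable, initially zero */

/* AWAR: read A and, only if it still equals `expected`, write `desired`,
 * all as one atomic read-modify-write. Returns true if the write happened. */
bool awar_cas(int expected, int desired)
{
    return atomic_compare_exchange_strong(&A, &expected, desired);
}

/* Another AWAR: atomically read A and write A + 1. Returns the old value. */
int awar_increment(void)
{
    return atomic_fetch_add(&A, 1);
}
```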

MEMORY CONSISTENCY MODEL
A memory consistency model for a shared address space specifies constraints on the order in which memory operations must appear to be performed.
    P1:              P2:
    A = 1;           while (flag == 0);
    flag = 1;        print A;
(A and flag are initially zero.)
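
A sketch of the same two-process example in C11, assuming release/acquire ordering on `flag` is what the programmer wants. The function names `p1_publish` and `p2_consume` are hypothetical. With these orderings the language guarantees that P2 sees `A == 1` once it has seen `flag == 1`; with relaxed orderings on `flag` it would not.

```c
#include <stdatomic.h>

static int A;              /* ordinary data, initially zero */
static atomic_int flag;    /* synchronization flag, initially zero */

/* P1: write the data, then raise the flag with release ordering so the
 * write to A cannot be reordered after the write to flag. */
void p1_publish(void)
{
    A = 1;
    atomic_store_explicit(&flag, 1, memory_order_release);
}

/* P2: spin until the flag is raised (acquire ordering), then read A. */
int p2_consume(void)
{
    while (atomic_load_explicit(&flag, memory_order_acquire) == 0) {
        /* spin */
    }
    return A;
}
```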

RELAXED MEMORY MODELS: PROGRAM ORDER
Intuitively, a read should return the value of the “last” write to the same memory location. In uniprocessors, “last” is precisely defined by program order, i.e., the order in which memory operations appear in the program. This is not the case in multiprocessors.

RELAXED MEMORY MODELS: SEQUENTIAL CONSISTENCY
An intuitive extension of the uniprocessor model can be applied to the multiprocessor case. This model is called sequential consistency. Sequential consistency requires that all memory operations appear to execute one at a time, and that the operations of a single processor appear to execute in the order described by that processor’s program.

RELAXED MEMORY MODELS: SEQUENTIAL CONSISTENCY, CONT'
By enforcing a strict order among shared memory operations, sequential consistency disallows many hardware and compiler optimizations that are possible in uniprocessors.

VIOLATION OF SEQUENTIAL CONSISTENCY: EXAMPLES
We will see violations of sequential consistency caused by familiar hardware optimizations in use today:
  • Write buffering
  • Caching
  • Compiler optimizations (not demonstrated here)

VIOLATION OF SEQUENTIAL CONSISTENCY: WRITE BUFFERING
The write strategy is an important part of cache design. A buffering scheme is frequently used to reduce the overhead associated with write operations.

VIOLATION OF SEQUENTIAL CONSISTENCY: DEKKER’S ALGORITHM (WRITE BUFFERING)
Each process's write is buffered, so both processes read the value 0. Mutual exclusion is violated.

VIOLATION OF SEQUENTIAL CONSISTENCY: DEKKER’S ALGORITHM (WRITE BUFFERING), CONT'
Which pattern (AWAR or RAW) have we seen here? RAW.
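
The write-buffering failure can be replayed deterministically with a toy model. Everything here is an assumption made for illustration (a one-entry store buffer per processor, the `Machine` type, and all function names); it is not how any particular processor is implemented. Both processors run `flag[me] = 1; r = flag[other]`, and the function counts how many of them read 0.

```c
#include <stdbool.h>

/* Toy model: two processors, each with a one-entry store buffer in front
 * of a shared memory that holds flag[0] and flag[1]. */
typedef struct {
    int  memory[2];     /* shared memory: flag[0], flag[1] */
    bool pending[2];    /* pending[p]: processor p has a buffered write */
    int  buf_addr[2];
    int  buf_val[2];
} Machine;

/* Push processor p's buffered write to memory; plays the role of a fence. */
static void drain(Machine *m, int p)
{
    if (m->pending[p]) {
        m->memory[m->buf_addr[p]] = m->buf_val[p];
        m->pending[p] = false;
    }
}

static void write_buffered(Machine *m, int p, int addr, int val)
{
    drain(m, p);                       /* one-entry buffer: make room first */
    m->pending[p] = true;
    m->buf_addr[p] = addr;
    m->buf_val[p] = val;
}

static int read_mem(const Machine *m, int p, int addr)
{
    if (m->pending[p] && m->buf_addr[p] == addr)
        return m->buf_val[p];          /* a processor sees its own buffered write */
    return m->memory[addr];
}

/* Returns how many processors read 0 and would enter the critical section. */
int dekker_entries(bool use_fence)
{
    Machine m = {0};
    int entered = 0;
    for (int p = 0; p < 2; p++)        /* both writes are issued (and buffered) first */
        write_buffered(&m, p, p, 1);
    for (int p = 0; p < 2; p++) {
        if (use_fence)
            drain(&m, p);              /* fence between the write and the read */
        if (read_mem(&m, p, 1 - p) == 0)
            entered++;
    }
    return entered;
}
```

In this model `dekker_entries(false)` lets both processors in, while draining the buffer before the read lets in only one of them for the same interleaving.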

VIOLATION OF SEQUENTIAL CONSISTENCY: CACHING
Updates for the writes of A by processors P1 and P2 may reach processors P3 and P4 in a different order. Processors P3 and P4 can then return different values for their reads of A, making the writes of A appear non-atomic (write(1) followed by write(2) vs. write(2) followed by write(1)).

RELAXED MEMORY MODELS
Modern processor architectures use relaxed memory models, where guaranteeing RAW order among accesses to independent memory locations requires executing memory ordering instructions (often called memory fences or memory barriers) that enforce RAW order.

RELAXED MEMORY CONSISTENCY MODELS (figure omitted)

MUTUAL EXCLUSION
Mutual exclusion: multiple processes cannot be in their critical sections at the same time.
We will show that whenever a process has sequentially executed its lock section, this execution must use RAW or AWAR. Otherwise, the algorithm does not satisfy the mutual exclusion specification and is incorrect.
  • First we will show that a process has to write to shared memory.
  • Then we will show that mutual exclusion fails when RAW and AWAR are avoided.

N-PROCESS MUTUAL EXCLUSION WITHOUT WRITING TO SHARED MEMORY
    Process i: Lock_i: … CS_i: … Unlock_i: …
    Process j: Lock_j: … CS_j: … Unlock_j: …
Process i does not write to shared memory. Without a write to shared memory, process j has no way of knowing where process i is. Mutual exclusion fails.

N-PROCESS MUTUAL EXCLUSION WITHOUT USING RAW AND AWAR
    Process i: Lock_i: … CS_i: … Unlock_i: …
    Process j: Lock_j: … CS_j: … Unlock_j: …
  1. Process i stops in its lock_i section just before writing to shared memory X (it is not using AWAR).
  2. Process j performs a full sequential execution of its lock_j section (process i still has not written to shared memory) and enters its critical section.
  3. Process i resumes its lock_i section and performs the shared write to X, overwriting any changes to X made by process j. If process j used other shared memory, process i cannot read that location without using RAW, so process i cannot detect process j and also enters its critical section.
Mutual exclusion fails.

N-PROCESS MUTUAL EXCLUSION: EXAMPLES
One of the most common lock implementations is based on the test-and-set atomic sequence.
    Lock_i: while (CAS(lock, free, busy) == false);
Which pattern (RAW or AWAR) can be seen here? AWAR.
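
A minimal sketch of such a lock with C11 atomics. The names `lock_word`, `spin_try_lock`, `spin_lock`, and `spin_unlock` are hypothetical; the compare-and-swap in `spin_try_lock` is the AWAR.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { FREE = 0, BUSY = 1 };

static atomic_int lock_word;   /* zero-initialized, i.e. FREE */

/* One CAS attempt: atomically read the lock and, if it is FREE, write BUSY. */
bool spin_try_lock(void)
{
    int expected = FREE;
    return atomic_compare_exchange_strong(&lock_word, &expected, BUSY);
}

/* Lock_i from the slide: retry the CAS until it succeeds. */
void spin_lock(void)
{
    while (!spin_try_lock()) {
        /* spin */
    }
}

void spin_unlock(void)
{
    atomic_store(&lock_word, FREE);
}
```

Even an uncontended `spin_lock()` pays for one atomic read-modify-write, which is the cost the talk is concerned with.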

2-PROCESS MUTUAL EXCLUSION: EXAMPLES
Dekker’s mutual exclusion algorithm for two processes is also a type of lock implementation.
    Lock_i: flag[i] = true; while (flag[1-i]) {…}
Which pattern (RAW or AWAR) can be seen here? RAW.
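
The entry step can be sketched in C11 as a single attempt rather than the full algorithm. This is a simplification made for illustration: the body of the slide's `{…}` loop is not reproduced, and `dekker_try_enter` and `dekker_exit` are hypothetical names. The default sequentially consistent store and load preserve the RAW order that the algorithm depends on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool flag[2];    /* both false initially */

/* One attempt by process i (0 or 1) to enter its critical section.
 * Returns true if it may enter, false if the other process is competing. */
bool dekker_try_enter(int i)
{
    assert(i == 0 || i == 1);
    atomic_store(&flag[i], true);        /* write flag[i] (seq_cst) */
    if (atomic_load(&flag[1 - i])) {     /* then read flag[1-i]: RAW */
        atomic_store(&flag[i], false);   /* contended: withdraw */
        return false;
    }
    return true;
}

void dekker_exit(int i)
{
    assert(i == 0 || i == 1);
    atomic_store(&flag[i], false);
}
```

If the store and load were relaxed and unfenced, the write-buffering scenario shown earlier in the talk could let both processes read false.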

LINEARIZABILITY: AN INTUITIVE DEFINITION
An algorithm is linearizable with respect to a sequential specification if each execution of the algorithm is equivalent to some sequential execution of the specification in which the order between non-overlapping methods is preserved. The equivalence is defined by comparing the arguments and results of method invocations.

LINEARIZABILITY: EXAMPLE (A QUEUE)
[Figure: timeline of q.enq(x), q.enq(y), and q.deq(x); the execution is linearizable.]

LINEARIZABILITY: USE OF RAW OR AWAR
In the case of linearizability, only some sequential executions of specific methods must use either RAW or AWAR, unlike mutual exclusion, where all sequential executions of a certain method (i.e., the lock section) must use either RAW or AWAR.
Two properties are defined:
  • Deterministic sequential specification
  • Strongly non-commutative methods

LINEARIZABILITY: DETERMINISTIC SEQUENTIAL SPECIFICATIONS
A sequential specification is deterministic if a method executed from the same state always produces the same result. Many classic abstract data types have deterministic specifications: sets, queues, etc.

LINEARIZABILITY: STRONGLY NON-COMMUTATIVE METHODS
A method m1 is said to be strongly non-commutative if there exist a method m2 and some state in the specification from which m1, executed sequentially by process p, can influence the result of m2, executed sequentially by process q (q ≠ p), and vice versa: m2 can influence the result of m1 from the same state. m1 and m2 are performed by different processes.

LINEARIZABILITY: STRONGLY NON-COMMUTATIVE METHODS, EXAMPLE (SET)
Sequential specification of Set:
  • Contains(k)
  • Add(k)
  • Remove(k)

LINEARIZABILITY: STRONGLY NON-COMMUTATIVE METHODS, EXAMPLE (SET), ADD
Add(k): is it a strongly non-commutative method? Yes.
Starting from a set that does not contain 5:
  • P1: Add(5) : true, then P2: Add(5) : false
  • P2: Add(5) : true, then P1: Add(5) : false
There exists another method (here, a second Add(5)) such that both method invocations influence each other’s result starting from some state.

LINEARIZABILITY: STRONGLY NON-COMMUTATIVE METHODS, EXAMPLE (SET), CONTAINS
Contains(k): is it a strongly non-commutative method? No.
Starting from a set that does not contain 5:
  • P1: Add(5) : true, then P2: Contains(5) : true
  • P2: Contains(5) : false, then P1: Add(5) : true
The result of Contains can be influenced by a preceding Add or Remove. However, its execution cannot influence the result of any other method.
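
The Set examples can be checked mechanically against the definition. The sketch below is an illustration under stated assumptions: the set is modelled as a bitmask over small keys (k below 32), and the names `apply`, `influences`, `mutually_influence`, and `strongly_non_commutative` are hypothetical. It tests, for one key, whether two methods influence each other's result from a common state.

```c
#include <stdbool.h>

typedef unsigned State;                    /* bit k set <=> k is in the set */
typedef enum { ADD, REMOVE, CONTAINS } Method;

/* Sequential specification: applies method m with key k to *s, returns its result. */
static bool apply(State *s, Method m, unsigned k)
{
    bool present = (*s >> k) & 1u;
    switch (m) {
    case ADD:    *s |= 1u << k;    return !present;  /* true if newly added */
    case REMOVE: *s &= ~(1u << k); return present;   /* true if it was there */
    default:     return present;                     /* CONTAINS */
    }
}

/* Does running `first` beforehand change the result of `second` from state s? */
static bool influences(State s, Method first, Method second, unsigned k)
{
    State alone = s, after = s;
    bool r_alone = apply(&alone, second, k);
    apply(&after, first, k);
    return apply(&after, second, k) != r_alone;
}

/* True if m1 and m2 (both on key k) influence each other from some common state. */
bool mutually_influence(Method m1, Method m2, unsigned k)
{
    for (State s = 0; s < 2; s++) {        /* only bit k matters: absent or present */
        State st = s << k;
        if (influences(st, m1, m2, k) && influences(st, m2, m1, k))
            return true;
    }
    return false;
}

bool strongly_non_commutative(Method m, unsigned k)
{
    return mutually_influence(m, ADD, k)
        || mutually_influence(m, REMOVE, k)
        || mutually_influence(m, CONTAINS, k);
}
```

Under this model Add and Remove come out strongly non-commutative and Contains does not, matching the slides.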

LINEARIZABILITY: USE OF RAW OR AWAR, INFORMAL PROOF
If a method is strongly non-commutative, then any of its sequential executions must perform a shared write. Why? Otherwise, there is no way for the method to influence the result of any other method that is executed after it, and hence the method cannot be strongly non-commutative.

LINEARIZABILITY: ADD(K) (RAW AND AWAR ARE NOT PRESENT)
    Process i: Add(k): {… return res}
    Process j: Add(k): {… return res}
  1. Process i stops just before writing to shared memory X (it is not using AWAR).
  2. Process j performs a full sequential execution of Add(k) and returns true (process i still has not written to shared memory).
  3. Process i resumes and performs the shared write to X, overwriting any changes to X made by process j. If process j used other shared memory, process i cannot read that location without using RAW. Process i returns true.

LINEARIZABILITY: ADD(K) (RAW AND AWAR ARE NOT PRESENT), CONT’
If the algorithm is linearizable, there are only two valid linearizations of a concurrent execution of two Add(k) calls:
  • Linearization 1: P1's Add(k), then P2's Add(k)
  • Linearization 2: P2's Add(k), then P1's Add(k)
In either linearization the two calls return different results (the first true, the second false). An execution in which both calls return true matches neither, so Add() would not be deterministic.
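
The argument can be replayed as a small check: given the results returned by two overlapping Add(k) calls on a set that initially lacks k, try both linearizations against the sequential specification. The names `seq_add` and `two_adds_linearizable` are hypothetical, and the set is reduced to a single boolean for the one key involved.

```c
#include <stdbool.h>

/* Sequential Add on a one-key set: returns true iff the key was absent. */
static bool seq_add(bool *present)
{
    bool was_present = *present;
    *present = true;
    return !was_present;
}

/* r1 and r2 are the results observed by P1 and P2. Returns true if some
 * sequential order of the two calls produces exactly these results. */
bool two_adds_linearizable(bool r1, bool r2)
{
    bool set = false;                 /* linearization 1: P1, then P2 */
    bool a1 = seq_add(&set);
    bool a2 = seq_add(&set);
    if (a1 == r1 && a2 == r2)
        return true;

    set = false;                      /* linearization 2: P2, then P1 */
    bool b2 = seq_add(&set);
    bool b1 = seq_add(&set);
    return b1 == r1 && b2 == r2;
}
```

The outcome from the previous slide, where both calls return true, is rejected by both linearizations.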

LINEARIZABILITY: EXAMPLES (CAS)
CAS(m, o, n) can be implemented trivially by a linearizable algorithm that uses an atomic hardware instruction (also called CAS); in that case it includes the AWAR pattern, just as in the test-and-set lock.
CAS can also be implemented by a linearizable algorithm that avoids AWAR but uses RAW (Luchangco et al.). In the fragment below, the write to b.X followed by the read of b.Y is the RAW:
    Bool WFCAS(Val ev, Val nv) {
        if (ev == nv) return WFRead() == ev;
        Blk b = L;
        b.X = p;
        if (b.Y) goto 27;
        ….

RELAXED SEMANTICS: PRACTICAL IMPLICATIONS
How is it still possible to avoid RAW and AWAR? By relaxing one or more of the following dimensions:
  • Deterministic specification
  • Strong non-commutativity
  • Single owner
  • Execution detectors

RELAXED SEMANTICS: EXAMPLES (IDEMPOTENT WORK STEALING)
Idempotent work stealing: from a deterministic to a nondeterministic specification.
Relaxing the deterministic specification is exemplified by the idempotent work stealing introduced by Michael et al., which allows each inserted item to be extracted at least once instead of exactly once. This relaxation makes it possible to avoid RAW and AWAR in the owner’s methods.

RELAXED SEMANTICS: EXAMPLES (IDEMPOTENT WORK STEALING), CONT’
    WorkItem take() {
        h = head;
        t = tail;
        if (h == t) return EMPTY;            // nothing to take
        task = tasks.array[h % tasks.size];
        head = h + 1;                        // plain write: no CAS, no fence
        return task;
    }

RELAXED SEMANTICS: EXAMPLES (FIFO QUEUE)
In examining concurrent algorithms for multi-consumer FIFO queues, one notes that either locking or CAS is used in the common path of nontrivial dequeue methods that return a dequeued item. We have shown that locking uses either RAW or AWAR. We have also shown that CAS uses either RAW or AWAR.

RELAXED SEMANTICS: EXAMPLES (FIFO QUEUE), CONT’
FIFO queue: from multi-consumer to single-consumer (single owner). Dequeue can be executed by only a single process, and therefore there is no need for RAW or AWAR.
Lamport’s FIFO queue:
    Data dequeue() {
        if (tail == head) return EMPTY;
        Data data = Q[head mod m];
        head = (head + 1) mod m;
        return data;
    }
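
A single-producer, single-consumer ring buffer in the style of Lamport's queue, sketched with C11 atomics. The capacity `M`, the `int` payload, the `bool` return convention, and the memory orderings are assumptions made for this illustration; the slide shows only the dequeue side. Within one call, `dequeue` performs its reads first and ends with a plain atomic store, so it contains neither a RAW nor an AWAR.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define M 8                      /* hypothetical ring size; holds up to M - 1 items */

static int Q[M];
static atomic_uint head;         /* written only by the consumer */
static atomic_uint tail;         /* written only by the producer */

/* Producer only. Returns false if the queue is full. */
bool enqueue(int data)
{
    unsigned t = atomic_load_explicit(&tail, memory_order_relaxed);
    unsigned next = (t + 1) % M;
    if (next == atomic_load_explicit(&head, memory_order_acquire))
        return false;                                   /* full */
    Q[t] = data;
    atomic_store_explicit(&tail, next, memory_order_release);
    return true;
}

/* Consumer only. Returns false if the queue is empty. */
bool dequeue(int *out)
{
    unsigned h = atomic_load_explicit(&head, memory_order_relaxed);
    if (h == atomic_load_explicit(&tail, memory_order_acquire))
        return false;                                   /* empty */
    *out = Q[h];
    atomic_store_explicit(&head, (h + 1) % M, memory_order_release);
    return true;
}
```

This works only because a single process calls `dequeue`; with several consumers, two of them could read the same `head` and return the same item.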

QUESTIONS?