Multiprocessor Synchronization Algorithms (20225241)
The Mutual Exclusion problem
Lecturer: Danny Hendler

The mutual exclusion problem (Dijkstra, 1965)
We need to devise a protocol that guarantees mutually exclusive access by processes to a shared resource (such as a file or a printer).

The problem model (reads/writes)
• Shared-memory multiprocessor: multiple processes
• Processes can apply atomic reads and writes to shared registers
• Completely asynchronous

Mutex: formal definition
  loop forever
    Remainder code
    Entry code
    Critical section (CS)
    Exit code
  end loop
[Figure: the cycle Remainder code, Entry code, CS, Exit code]

Mutex Requirements
• Mutual exclusion: No two processes are in their CS at the same time.
• Deadlock-freedom: If a process is trying to enter its critical section, then some process eventually enters its critical section.
• Starvation-freedom (optional): If a process is trying to enter its critical section, then this process must eventually enter its critical section.
Assumption: processes do not fail-stop while performing the entry, CS, or exit code.

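Candidate algorithm 1
initially: turn = 0
Program for process 0:
  await turn = 0
  CS of process 0
  turn := 1
Program for process 1:
  await turn = 1
  CS of process 1
  turn := 0
Does algorithm 1 satisfy mutex? Yes
Does it satisfy deadlock-freedom? No (a process that stays in its remainder code never passes the turn, so the other process waits forever)
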
Candidate algorithm 2
initially: lock = 0
Program for both processes:
  await lock = 0
  lock := 1
  CS
  lock := 0
Does algorithm 2 satisfy mutex? No (both processes can read lock = 0 before either of them writes lock := 1)
Does it satisfy deadlock-freedom? Yes

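The failure of candidate algorithm 2 can be checked mechanically. The following is a minimal sketch, assuming each numbered statement is one atomic step and that a process returns to its entry code right after the exit code. The names `step` and `find_violation` and the encoding of a state as `(pc0, pc1, lock)` are illustrative choices, not part of the lecture.

```python
from collections import deque

def step(state, p):
    """State after process p takes one atomic step, or None if p is blocked."""
    pcs, lock = [state[0], state[1]], state[2]
    if pcs[p] == 0:            # statement 1: await lock = 0 (one atomic read)
        if lock != 0:
            return None
        pcs[p] = 1
    elif pcs[p] == 1:          # statement 2: lock := 1, then enter the CS
        lock, pcs[p] = 1, 2
    elif pcs[p] == 2:          # statement 3: leave the CS
        pcs[p] = 3
    else:                      # statement 4: lock := 0, then start over
        lock, pcs[p] = 0, 0
    return (pcs[0], pcs[1], lock)

def find_violation():
    """Breadth-first search for a state in which both processes are in the CS."""
    start = (0, 0, 0)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state[0] == 2 and state[1] == 2:
            trace = []
            while state is not None:   # walk back to the initial state
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for p in (0, 1):
            nxt = step(state, p)
            if nxt is not None and nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

if __name__ == "__main__":
    for s in find_violation():
        print(s)
```

The search returns a shortest interleaving that ends with both program counters at the critical section, which is the scenario where both processes pass the await before either one sets the lock.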
Candidate algorithm 3
initially: flag[0] = false, flag[1] = false
Program for process 0:
  flag[0] := true
  await flag[1] = false
  CS of process 0
  flag[0] := false
Program for process 1:
  flag[1] := true
  await flag[0] = false
  CS of process 1
  flag[1] := false
Does algorithm 3 satisfy mutex? Yes
Does it satisfy deadlock-freedom? No (if both processes set their flags before either one reaches its await, both wait forever)

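Both answers for candidate algorithm 3 can be confirmed by enumerating every reachable state. This is a sketch under the assumption that each statement is atomic and that processes retry immediately after exiting; `step`, `explore`, `both_in_cs` and `stuck` are hypothetical names introduced for the illustration.

```python
def step(state, p):
    """One atomic step of process p, or None if p is blocked at its await."""
    pc, flag = list(state[0]), list(state[1])
    if pc[p] == 0:                # statement 1: flag[p] := true
        flag[p], pc[p] = True, 1
    elif pc[p] == 1:              # statement 2: await flag[1-p] = false
        if flag[1 - p]:
            return None
        pc[p] = 2
    elif pc[p] == 2:              # statement 3: leave the CS
        pc[p] = 3
    else:                         # statement 4: flag[p] := false, start over
        flag[p], pc[p] = False, 0
    return (tuple(pc), tuple(flag))

def explore():
    """Return the set of all states reachable from the initial state."""
    start = ((0, 0), (False, False))
    seen, stack = {start}, [start]
    while stack:
        state = stack.pop()
        for p in (0, 1):
            nxt = step(state, p)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

reachable = explore()
# States that would violate mutual exclusion (both processes in the CS).
both_in_cs = [s for s in reachable if s[0] == (2, 2)]
# States in which neither process can take a step.
stuck = [s for s in reachable if step(s, 0) is None and step(s, 1) is None]

if __name__ == "__main__":
    print("mutual exclusion violations:", both_in_cs)
    print("deadlocked states:", stuck)
```

In this model the only stuck state is the one where both flags are true and both processes sit at their await.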
Peterson’s 2-process algorithm (Peterson, 1981)
initially: b[0] = false, b[1] = false, turn = 0 or 1
Program for process 0:
  b[0] := true
  turn := 0
  await (b[1] = false or turn = 1)
  CS
  b[0] := false
Program for process 1:
  b[1] := true
  turn := 1
  await (b[0] = false or turn = 0)
  CS
  b[1] := false

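Mutual exclusion for Peterson's algorithm can be checked by exhaustive exploration of its (finite) state space. The sketch below assumes atomic registers and models the await as two separate reads, first of the other process's flag and then of turn. The function names and the state encoding `(pc, b, turn)` are assumptions made for the illustration.

```python
def successors(state):
    """Yield every state reachable by one atomic step of either process."""
    for p in (0, 1):
        pc, b, turn = list(state[0]), list(state[1]), state[2]
        q = 1 - p
        if pc[p] == 0:        # b[p] := true
            b[p], pc[p] = True, 1
        elif pc[p] == 1:      # turn := p
            turn, pc[p] = p, 2
        elif pc[p] == 2:      # await, first read: is b[q] false?
            pc[p] = 4 if not b[q] else 3
        elif pc[p] == 3:      # await, second read: is turn = q?
            pc[p] = 4 if turn == q else 2
        elif pc[p] == 4:      # leave the CS
            pc[p] = 5
        else:                 # b[p] := false, then start over
            b[p], pc[p] = False, 0
        yield (tuple(pc), tuple(b), turn)

def reachable_states():
    """All states reachable from either initial value of turn."""
    starts = [((0, 0), (False, False), t) for t in (0, 1)]
    seen, stack = set(starts), list(starts)
    while stack:
        for nxt in successors(stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def mutual_exclusion_holds():
    """True if no reachable state has both processes in the CS (pc = 4)."""
    return all(pc != (4, 4) for pc, _, _ in reachable_states())

if __name__ == "__main__":
    print(len(reachable_states()), "reachable states")
    print("mutual exclusion holds:", mutual_exclusion_holds())
```

The check covers safety only; it says nothing about deadlock-freedom or starvation-freedom, which need a separate argument.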
Kessels’ single-writer algorithm (Kessels, 1982)
A single-writer register is a register that can be written by a single process only.
initially: b[0] = false, b[1] = false, turn[0], turn[1] = 0 or 1
Program for process 0:
  b[0] := true
  local[0] := turn[1]
  turn[0] := local[0]
  await (b[1] = false or local[0] <> turn[1])
  CS
  b[0] := false
Program for process 1:
  b[1] := true
  local[1] := 1 - turn[0]
  turn[1] := local[1]
  await (b[0] = false or local[1] = turn[0])
  CS
  b[1] := false
Synchronization Algorithms and Concurrent Programming, Gadi Taubenfeld © 2006

Mutual exclusion for n processes: Tournament trees
[Figure: a binary tournament tree with levels 0 to 2 above the eight processes 0 to 7; the nodes of each level are numbered from 0]
A tree-node is identified by: [level, node#]

Tournament tree based on Peterson’s 2-process alg.
Variables
  Per node: b[level, 2·node], b[level, 2·node+1], turn[level, node]
  Per process (local): level, node, id
Program for process i:
  node := i
  for level = 0 to log n - 1 do
    id := node mod 2
    node := ⌊node/2⌋
    b[level, 2·node+id] := true
    turn[level, node] := id
    await (b[level, 2·node+1-id] = false or turn[level, node] = 1-id)
  od
  CS
  for level = log n - 1 downto 0 do
    node := ⌊i/2^level⌋
    b[level, node] := false
  od

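The index arithmetic in the entry and exit loops is easy to get wrong, so here is a small sketch that only computes which node and which side a process uses at each level. It assumes n is a power of two; `path_to_root` is a hypothetical helper name, not something defined in the lecture.

```python
def path_to_root(i, n):
    """Return [(level, node, id)] for process i in a tree over n processes.

    Mirrors the entry loop: id := node mod 2, node := floor(node / 2).
    Assumes n is a power of two.
    """
    levels = n.bit_length() - 1      # log2(n) for a power of two
    node, path = i, []
    for level in range(levels):
        side = node % 2              # which of the two flags of the node is ours
        node //= 2                   # number of the node at this level
        path.append((level, node, side))
    return path

if __name__ == "__main__":
    for level, node, side in path_to_root(5, 8):
        # The flag written on entry, b[level, 2*node + id], has the same
        # index as the one reset on exit, b[level, floor(i / 2**level)].
        print(level, node, side, 2 * node + side, 5 >> level)
```

For process 5 out of 8 the path visits node 2 at level 0, node 1 at level 1 and node 0 at level 2, and the flag index written on entry matches the index reset on exit at every level.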
The tournament tree using Peterson’s 2-process algorithm satisfies both mutual exclusion and starvation-freedom.
Contention-free step complexity
The worst-case number of steps for a process to enter the CS when it runs by itself.
What’s the contention-free step complexity of Peterson’s tournament tree? log n
Can we do better?

Lamport’s fast mutual exclusion algorithm
Variables
  fast-lock, slow-lock: initially 0
  want[i]: initially false
Program for process i:
   1. want[i] := true
   2. fast-lock := i
   3. if slow-lock <> 0 then
   4.   want[i] := false
   5.   await slow-lock = 0
   6.   goto 1
   7. slow-lock := i
   8. if fast-lock <> i then
   9.   want[i] := false
  10.   for j := 1 to n do await want[j] = false od
  11.   if slow-lock <> i then
  12.     await slow-lock = 0
  13.     goto 1
  14. CS
  15. slow-lock := 0
  16. want[i] := false

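To see why this algorithm is called fast, one can count the shared-memory accesses of a process that runs alone. The sketch below is single-threaded and only meant for that solo run; `CountingMemory` and `solo_run` are hypothetical names, and the waiting loops are transcribed for completeness even though a solo run never reaches them.

```python
class CountingMemory:
    """Shared registers that count every read and write."""
    def __init__(self, n):
        self.cells = {"fast": 0, "slow": 0}
        for j in range(1, n + 1):
            self.cells[("want", j)] = False
        self.accesses = 0

    def read(self, key):
        self.accesses += 1
        return self.cells[key]

    def write(self, key, value):
        self.accesses += 1
        self.cells[key] = value

def solo_run(i, n):
    """Run process i (1 <= i <= n) alone; return (entry, exit) access counts."""
    mem = CountingMemory(n)
    while True:
        mem.write(("want", i), True)
        mem.write("fast", i)
        if mem.read("slow") != 0:              # contention: never true when alone
            mem.write(("want", i), False)
            while mem.read("slow") != 0:
                pass
            continue                           # goto 1
        mem.write("slow", i)
        if mem.read("fast") != i:              # contention: never true when alone
            mem.write(("want", i), False)
            for j in range(1, n + 1):
                while mem.read(("want", j)):
                    pass
            if mem.read("slow") != i:
                while mem.read("slow") != 0:
                    pass
                continue                       # goto 1
        break
    entry = mem.accesses
    # critical section would run here
    mem.write("slow", 0)
    mem.write(("want", i), False)
    return entry, mem.accesses - entry

if __name__ == "__main__":
    print(solo_run(3, 8))
```

In the absence of contention the counts do not depend on n, in contrast to the log n levels a process climbs in the tournament tree.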
Schematic for Lamport’s fast mutual exclusion
  1. Indicate contention: want[i] := true, fast-lock := i
  2. Is there contention (slow-lock <> 0)? If yes, wait until the CS is released (want[i] := false, await slow-lock = 0).
  3. Barrier: slow-lock := i
  4. Is there contention (fast-lock <> i)? If no, enter the CS.
  5. If yes, wait until no other process can cross the barrier.
  6. Not last to cross the barrier (slow-lock <> i)? If yes, wait until the CS is released; if no, enter the CS.
  7. CS, then EXIT.

Lamport’s fast mutual exclusion algorithm satisfies both mutual exclusion and deadlock-freedom.
First in First Out (FIFO)
• Mutual exclusion
• Deadlock-freedom
• Starvation-freedom
• FIFO: if process p is waiting and process q has not yet started the doorway, then q will not enter the CS before p.
[Figure: remainder code, entry code (a doorway followed by waiting), critical section, exit code]

Lamport’s Bakery Algorithm
[Figure: timeline of processes 1 to n moving through remainder, entry (doorway, then waiting), CS and exit, together with the ticket numbers they hold]

Implementation 1
code of process i, i ∈ {1, ..., n}
  number[i] := 1 + max {number[j] | (1 ≤ j ≤ n)}
  for j := 1 to n (<> i) {
    await (number[j] = 0) ∨ (number[j] > number[i])
  }
  critical section
  number[i] := 0
Shared variables: number[1..n], integers, initially 0
Does this implementation work? Answer: No, it can deadlock!

Implementation 1: deadlock
[Figure: timeline in which two processes hold the same ticket number 2 in the waiting stage and neither ever enters the CS]

Implementation 2
code of process i, i ∈ {1, ..., n}
  number[i] := 1 + max {number[j] | (1 ≤ j ≤ n)}
  for j := 1 to n (<> i) {
    await (number[j] = 0) ∨ ((number[j], j) > (number[i], i))   // lexicographical order
  }
  critical section
  number[i] := 0
Shared variables: number[1..n], integers, initially 0
Does this implementation work? Answer: It does not satisfy mutual exclusion!

Implementation 2: no mutual exclusion
[Figure: timeline in which a process is still in its doorway with number 0 while others pass it, ending with more than one process in the CS]

The Bakery Algorithm
code of process i, i ∈ {1, ..., n}
  Doorway:
    1: choosing[i] := true
    2: number[i] := 1 + max {number[j] | (1 ≤ j ≤ n)}
    3: choosing[i] := false
  Waiting:
    4: for j := 1 to n do
    5:   await choosing[j] = false
    6:   await (number[j] = 0) ∨ ((number[j], j) ≥ (number[i], i))
    7: od
  8: critical section
  9: number[i] := 0
Shared variables: choosing[1..n], bits, initially false; number[1..n], integers, initially 0

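The waiting condition on line 6 orders processes by the pair (ticket, process id), which is what breaks ties between equal tickets. The following sketch evaluates that condition on a fixed snapshot of the number array; it is not a concurrent implementation, and `may_pass`, `can_enter` and `service_order` are hypothetical helper names.

```python
def may_pass(number, i, j):
    """Line 6 for a fixed snapshot: process i does not have to wait for j."""
    return number[j] == 0 or (number[j], j) >= (number[i], i)

def can_enter(number, i):
    """True if process i holds a ticket and passes the check for every j."""
    ids = range(1, len(number))          # index 0 is unused, ids are 1..n
    return number[i] != 0 and all(may_pass(number, i, j) for j in ids)

def service_order(number):
    """Order in which waiting processes enter if nobody takes a new ticket."""
    number = list(number)
    order = []
    while any(number[1:]):
        winner = next(i for i in range(1, len(number)) if can_enter(number, i))
        order.append(winner)
        number[winner] = 0               # exit code: number[i] := 0
    return order

if __name__ == "__main__":
    # Processes 1 and 4 drew the same ticket; process 2 is in its remainder.
    print(service_order([0, 2, 0, 1, 2]))
```

Python compares tuples lexicographically, so `(number[j], j) >= (number[i], i)` matches the pair ordering used on the slide.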
Computing the maximum
code of process i, i ∈ {0, ..., n-1}
The correctness of the Bakery algorithm depends on an implicit assumption on the implementation of computing the maximum (statement 2). Below we give a correct implementation. For each process, three additional local registers are used. They are named local1, local2, local3 and their initial values are immaterial.
  local1 := 0
  for local2 := 1 to n {
    local3 := number[local2]
    if local1 < local3 then {local1 := local3}
  }
  number[i] := 1 + local1

Question: Computing the maximum
code of process i, i ∈ {0, ..., n-1}
Is the following implementation also correct? That is, does the Bakery algorithm solve the mutual exclusion problem when the following implementation is used? Justify your answer.
For each process, two additional local registers are used. They are named local1, local2, and their initial values are immaterial.
  local1 := i
  for local2 := 1 to n {
    if number[local1] < number[local2] then {local1 := local2}
  }
  number[i] := 1 + number[local1]

The 2nd maximum alg. doesn’t work
[Figure: timeline of processes 1 to n through remainder, doorway, waiting, CS and exit, with the value of local1 shown; annotations on the slide: "Passed process 1", "Waiting for process 2"]

Properties of the Bakery algorithm
• Satisfies mutual exclusion and first-come-first-served.
• The size of number[i] is unbounded.
  – In practice this is not a problem: 32-bit registers give ticket numbers that can grow up to 2^32, a value that in practice will never be reached.
• There is no need to assume that operations on the same memory location occur in some definite order; the algorithm works correctly even when reads that are concurrent with writes are allowed to return an arbitrary value.

The Black-White Bakery Algorithm
Bounding the space of the Bakery Algorithm
• Bakery: FIFO, unbounded space
• Black-White Bakery: FIFO, bounded space + one bit

The Black-White Bakery Algorithm
[Figure: timeline of processes 1 to n through remainder, doorway, waiting, CS and exit, with their ticket numbers and the shared color bit]

The Black-White Bakery Algorithm: Data Structures
• choosing[1..n]: bits
• mycolor[1..n]: bits
• number[1..n]: values in {0, 1, ..., n}
• color: one bit, in {black, white}

The Black-White Bakery Algorithm
code of process i, i ∈ {1, ..., n}
  choosing[i] := true
  mycolor[i] := color
  number[i] := 1 + max {number[j] | (1 ≤ j ≤ n) ∧ (mycolor[j] = mycolor[i])}
  choosing[i] := false
  for j := 1 to n do
    await choosing[j] = false
    if mycolor[j] = mycolor[i]
    then await (number[j] = 0) ∨ ((number[j], j) ≥ (number[i], i)) ∨ (mycolor[j] ≠ mycolor[i])
    else await (number[j] = 0) ∨ (mycolor[i] ≠ color) ∨ (mycolor[j] = mycolor[i])
    fi
  od
  critical section
  if mycolor[i] = black then color := white else color := black fi
  number[i] := 0

A space lower bound for deadlock-free mutex (Burns & Lynch, 1993)
How many registers must an n-process deadlock-free mutual exclusion algorithm use if it can only use single-writer registers? At least n, since every process must be able to write to some register.
We now prove that the same result holds for multi-reader-multi-writer registers, regardless of their size.

Some definitions required for the proof
• Configuration
• A quiescent configuration
• Indistinguishable configurations
• A P-quiescent configuration
• A covered register
• An execution

Example of indistinguishability
Execution x is indistinguishable from execution y to process p.
Execution x:
  • p reads 5 from r1
  • q writes 6 to r1
  • p writes 7 to r1
  • q writes 8 to r1
  • p reads 8 from r1
  • q writes 6 to r1
Execution y:
  • p reads 5 from r1
  • p writes 7 to r1
  • q writes 6 to r1
  • q reads 6 from r1
  • q writes 8 to r1
  • p reads 8 from r1
The values of the shared registers must also be the same.

Illustration for Lemma 1
[Figure: diagram of executions between configurations C, C1, D, D1, Q, Q1, R and Z. Labels on the slide: C is pi-quiescent with W covered by P; D and Q are quiescent; C ~ D with respect to pi and Q ~ Q1 with respect to pj; pi is in the CS in C1 and D1; pj is in the CS after running from Q; in Z both pi and pj are in the CS.]
Based on the proof in “Distributed Computing”, by Hagit Attiya & Jennifer Welch

Illustration for the simple part of Lemma 2
[Figure: diagram of executions between configurations C1, C'2, D1, D'1 and C2. Labels on the slide: C1 and C2 are {pk, …, pn-1}-quiescent with p0…pk-1 covering W; pk runs alone until it covers x, giving a {pk+1, …, pn-1}-quiescent configuration in which W ∪ {x} is covered; p0…pk write to W and exit; D1 is quiescent; D'1 ~ D1 with respect to P-{pk}, with x covered and P-{pk} in the remainder.]

Illustration for the general part of Lemma 2
[Figure: chain of configurations D0, C1, D1, C2, …, Ci, Di, …, Cj with primed counterparts D'i and C'j. Labels on the slide: each Di is quiescent; each Ci is {pk, …, pn-1}-quiescent with p0…pk-1 covering Wi; D'i ~ Di; the primed configurations are {pk+1, …, pn-1}-quiescent with W ∪ {x} covered.]

A matching upper bound: the one-bit algorithm
initially: b[i] := false
Program for process i:
  repeat
    b[i] := true; j := 1
    while (b[i] = true) and (j < i) do
      if (b[j] = true) then
        b[i] := false
        await b[j] = false
      j := j+1
  until b[i] = true
  for (j := i+1 to n) do
    await b[j] = false
  Critical Section
  b[i] := false

Read-Modify-Write (RMW) operations
Read-modify-write(w, f):
  do atomically
    prev := w
    w := f(prev)
    return prev
Fetch-and-add(w, Δ):
  do atomically
    prev := w
    w := prev + Δ
    return prev
Test-and-set(w):
  do atomically
    prev := w
    w := 1
    return prev

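The three definitions share one shape: fetch-and-add and test-and-set are RMW with a particular function f. A minimal sketch follows, assuming atomicity is emulated with an internal `threading.Lock` rather than provided by hardware; the class name `Register` and its method names are illustrative.

```python
import threading

class Register:
    """A shared word supporting read-modify-write operations.

    Atomicity is emulated with a lock; real hardware offers these
    operations as single instructions.
    """
    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def rmw(self, f):
        """Atomically replace the value w by f(w) and return the old value."""
        with self._guard:
            prev = self._value
            self._value = f(prev)
            return prev

    def fetch_and_add(self, delta):
        """RMW with f(prev) = prev + delta."""
        return self.rmw(lambda prev: prev + delta)

    def test_and_set(self):
        """RMW with the constant function f(prev) = 1."""
        return self.rmw(lambda prev: 1)

if __name__ == "__main__":
    w = Register(5)
    print(w.fetch_and_add(3))        # returns the previous value
    print(w.rmw(lambda prev: prev))  # identity f acts as an atomic read
    t = Register(0)
    print(t.test_and_set(), t.test_and_set())
```

Passing the identity function to `rmw` gives an atomic read, which is how the example inspects the register without changing it.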
Mutual exclusion using test-and-set
initially: v := 0
Program for process i:
  await test&set(v) = 0
  Critical Section
  v := 0
Mutual exclusion? Yes
Deadlock-freedom? Yes
Starvation-freedom? No

Mutual exclusion using general RMW
initially: v := <0, 0>
Program for process i:
  position := RMW(v, <v.first, v.last+1>)
  repeat
    queue := v
  until queue.first = position.last
  Critical Section
  RMW(v, <v.first+1, v.last>)
How many bits does this algorithm require? An unbounded number, but this can be improved to 2 log₂ n.

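The pair <first, last> acts as a queue: last is the next ticket to hand out and first is the ticket currently allowed into the critical section. The sketch below steps through that bookkeeping sequentially, without threads, so the RMW is trivially atomic; `PairRegister`, `take_ticket`, `may_enter` and `release` are hypothetical names.

```python
class PairRegister:
    """Holds the pair (first, last); single-threaded, so rmw is trivially atomic."""
    def __init__(self):
        self.value = (0, 0)

    def rmw(self, f):
        prev = self.value
        self.value = f(prev)
        return prev

def take_ticket(v):
    """position := RMW(v, <v.first, v.last+1>); the ticket is position.last."""
    position = v.rmw(lambda pair: (pair[0], pair[1] + 1))
    return position[1]

def may_enter(v, ticket):
    """One evaluation of the repeat-until test: queue.first = position.last."""
    queue = v.value
    return queue[0] == ticket

def release(v):
    """Exit code: RMW(v, <v.first+1, v.last>)."""
    v.rmw(lambda pair: (pair[0] + 1, pair[1]))

if __name__ == "__main__":
    v = PairRegister()
    tickets = [take_ticket(v) for _ in range(3)]
    print(tickets, v.value)
    print(may_enter(v, tickets[0]), may_enter(v, tickets[1]))
    release(v)
    print(may_enter(v, tickets[1]), v.value)
```

Tickets are served in the order they were drawn, which is why this scheme is fair where the plain test-and-set lock is not.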
Lower bound on the number of bits required for mutual exclusion