Mutual Exclusion Nir Shavit Multiprocessor Synchronization Fall 2003

Mutual Exclusion Nir Shavit Multiprocessor Synchronization Fall 2003 6/19/2021 © 2003 Herlihy and Shavit

Mutual Exclusion in Detail
• Formal problem definitions
• Solutions for 2 threads
• Solutions for n threads
• Fair solutions
• Inherent costs

Warning
• You will never use these protocols
  – Get over it
• You had better understand them
  – The same issues show up everywhere
  – If you can’t reason about these, you won’t get far with “real” protocols

Why is Concurrent Programming so Hard?
• Cooking an omelet is easy
• Cooking a five-course meal is hard
• Before we can talk about programs
  – Need a language
  – Describing time and concurrency

Time
• “Absolute, true and mathematical time, of itself and from its own nature, flows equably without relation to anything external.” (I. Newton, 1689)
• “Time is Nature’s way of making sure that everything doesn’t happen all at once.” (Anonymous, circa 1970)
[Figure: a time axis]

Events
• An event a_0 of thread A is
  – Instantaneous
  – No simultaneous events
[Figure: event a_0 marked on a time axis]

Threads
• A thread A is (formally) a sequence a_0, a_1, ... of events
  – “Trace” model
  – Notation: a_0 → a_1 indicates order
[Figure: events a_0, a_1, a_2, … in order along a time axis]

Example Thread Events
• Assign to shared variable
• Assign to local variable
• Call method
• Return from called method
• Lots of other things …

Threads are State Machines
• Events are transitions
[Figure: state machine whose transitions are labeled a_0, a_1, a_2, a_3]

States
• Thread State
  – Program counter
  – Local variables
• System state
  – Object fields (shared variables)
  – Union of thread states

Concurrency
[Figure: two parallel time axes, one for Thread A and one for Thread B]

Interleavings
• Events of two or more threads
  – Interleaved
  – Not necessarily independent (why?)

Intervals
• An interval A_0 = (a_0, a_1) is
  – Time between events a_0 and a_1
[Figure: interval A_0 spanning a_0 to a_1 on a time axis]

Intervals may Overlap
[Figure: intervals A_0 = (a_0, a_1) and B_0 = (b_0, b_1) overlapping on a time axis]

Intervals may be Disjoint
[Figure: intervals A_0 = (a_0, a_1) and B_0 = (b_0, b_1) with no overlap on a time axis]

Precedence
• Interval A_0 precedes interval B_0
[Figure: A_0 ends before B_0 starts on a time axis]

Precedence
• Notation: A_0 → B_0
• Formally,
  – End event of A_0 before start event of B_0
  – Also called “happens before”
  – Defines a partial order on intervals

Partial Orders (you should know this already)
• Irreflexive:
  – Never true that a → a
• Antisymmetric:
  – If a → b then not true that b → a
• Transitive:
  – If a → b and b → c then a → c
• How does this differ from a total order?
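One way to see the difference from a total order is to model intervals directly. The sketch below is only an illustration: the class name `Interval` and the use of `long` timestamps on a single global clock are assumptions, not part of the slides. It shows that two overlapping intervals are simply incomparable under "precedes".

```java
// Hypothetical model: an interval is a pair of event times on one global clock.
final class Interval {
    final long start;
    final long end;

    Interval(long start, long end) {
        if (start >= end) throw new IllegalArgumentException("start event must come before end event");
        this.start = start;
        this.end = end;
    }

    // A precedes B iff A's end event comes before B's start event.
    boolean precedes(Interval other) {
        return this.end < other.start;
    }

    // Two intervals overlap iff neither precedes the other (they are incomparable).
    boolean overlaps(Interval other) {
        return !this.precedes(other) && !other.precedes(this);
    }
}
```

In a total order every pair of distinct elements is comparable; here `overlaps` returning true for distinct intervals is exactly the case a total order rules out.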

Repeated Events
  while (mumble) {
    a_0; a_1;
  }
• a_0^k: k-th occurrence of event a_0
• A_0^k: k-th occurrence of interval A_0 = (a_0, a_1)

Review: Atomic Increment
  public class Counter {
    private long value;
    public long inc() {
      // allow only one thread at a time in here
      long temp = value;
      value = value + 1;
      return temp;
    }
  }

Synchronization
  public interface Lock {
    // each implementation supplies its own constructor
    public void acquire(int i);   // call before entering critical section
    public void release(int i);   // call after leaving critical section
  }

Synchronized Atomic Increment
  public class Counter {
    private long value;
    private Lock lock;
    public long inc(int i) {      // i: ID of the calling thread
      lock.acquire(i);
      long temp = value;          // critical section
      value = value + 1;          // critical section
      lock.release(i);
      return temp;
    }
  }
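For comparison, here is a runnable sketch of the same counter. It swaps the slides' `Lock` interface for `java.util.concurrent.locks.ReentrantLock`, which is an assumption made only so the example runs as is; the `run` harness is also hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch: the slide's Counter guarded by a library lock (not the slides' Lock interface).
class LockedCounter {
    private long value;
    private final ReentrantLock lock = new ReentrantLock();

    long inc() {
        lock.lock();                 // enter critical section
        try {
            long temp = value;
            value = temp + 1;
            return temp;
        } finally {
            lock.unlock();           // leave critical section, even if an exception is thrown
        }
    }

    // Hypothetical harness: `threads` threads each call inc() `perThread` times.
    static long run(int threads, int perThread) {
        LockedCounter c = new LockedCounter();
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            ts[t] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) c.inc();
            });
            ts[t].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return c.inc();              // value before this last increment = total increments so far
    }
}
```

Without the lock, two threads can read the same `value` and one increment is lost; with it, the total always equals threads × perThread.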

Critical Sections
• Let CS_i^k be thread i’s k-th execution of the critical section
• Let CS_j^{k'} be thread j’s k'-th execution
• Then either
  – CS_i^k → CS_j^{k'}, or
  – CS_j^{k'} → CS_i^k
• I.e., no overlap

Formal Properties
Mutual Exclusion: for every two threads i and j and integers k, k', either CS_i^k → CS_j^{k'} or CS_j^{k'} → CS_i^k.
Deadlock Freedom: if a thread never completes a call to acquire(), then there is an infinite sequence of critical sections being executed by other threads.
Lockout Freedom: every call to acquire() will eventually complete.

Two-Thread vs n-Thread Solutions
• Two-thread solutions first
  – Illustrate most basic ideas
  – Fits on one slide
• Notation watch: j = 1 - i sends
  – 1 to 0,
  – 0 to 1

Two-Thread Conventions
  public class Thread {
    private int i;           // ID for this thread
    private int j = 1 - i;   // ID for other thread
    public void run() {      // method that does all the work
      …
    }
  }

Lock1
  public class Lock1 implements Lock {
    private boolean[] flag = new boolean[2];
    public void acquire(int i) {
      int j = 1 - i;         // other guy
      flag[i] = true;        // set my flag
      while (flag[j]) {}     // wait for other flag to go down
    }
    public void release(int i) {
      flag[i] = false;       // no longer interested
    }
  }

Mutual Exclusion
Lemma: Lock1 satisfies mutual exclusion.
Proof: Assume, by way of contradiction and without loss of generality, that CS_A is concurrent with CS_B. From the code:
(1) write_A(flag[A] = true) → read_A(flag[B] == false) → CS_A
(2) write_B(flag[B] = true) → read_B(flag[A] == false) → CS_B
(3) read_A(flag[B] == false) → write_B(flag[B] = true)
Chaining (1), (3) and (2) gives write_A(flag[A] = true) → read_B(flag[A] == false), so B must have read A’s flag as true. Contradiction!

Deadlock Freedom
• Fails deadlock-freedom
  – Concurrent execution can deadlock
      Thread i: flag[i] = true;  while (flag[j]) {}
      Thread j: flag[j] = true;  while (flag[i]) {}
  – Sequential execution cannot

Lock2
  public class Lock2 implements Lock {
    private int victim;
    public void acquire(int i) {
      victim = i;               // ultimate sacrifice
      while (victim == i) {};   // wait for other
    }
    public void release(int i) {}   // nothing to do
  }

Lock2 Claims
• Satisfies mutual exclusion
  – If thread i in CS
  – Then victim == j
  – victim is never both 0 and 1!
• Not deadlock free
  – Sequential execution deadlocks
  – Concurrent execution does not

Peterson’s Algorithm
  public void acquire(int i) {
    int j = 1 - i;     // vile binary hack
    flag[i] = true;    // I’m interested
    victim = i;        // you first
    while (flag[j] && victim == i) {};
  }
  public void release(int i) {
    flag[i] = false;   // lost interest
  }
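The fragment above can be turned into something runnable. The sketch below is one possible Java rendering, not the authors' code: it assumes `AtomicIntegerArray` and a `volatile` field so that reads and writes of `flag` and `victim` are not reordered or cached, and it uses a hypothetical `run` harness in which two threads perform a deliberately non-atomic read-modify-write inside the critical section.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.concurrent.atomic.AtomicLong;

class PetersonLock {
    private final AtomicIntegerArray flag = new AtomicIntegerArray(2); // 1 = interested
    private volatile int victim;

    void acquire(int i) {
        int j = 1 - i;
        flag.set(i, 1);              // I'm interested
        victim = i;                  // you first
        while (flag.get(j) == 1 && victim == i) {
            Thread.yield();          // busy wait, but give the other thread a chance to run
        }
    }

    void release(int i) {
        flag.set(i, 0);              // lost interest
    }

    // Hypothetical harness: threads 0 and 1 each increment a shared counter perThread times.
    static long run(int perThread) {
        PetersonLock lock = new PetersonLock();
        AtomicLong counter = new AtomicLong();
        Thread[] ts = new Thread[2];
        for (int id = 0; id < 2; id++) {
            final int me = id;
            ts[id] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    lock.acquire(me);
                    counter.set(counter.get() + 1);  // non-atomic on purpose: relies on the lock
                    lock.release(me);
                }
            });
            ts[id].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return counter.get();
    }
}
```

With plain (non-volatile) fields the Java memory model would allow reorderings that break the proof's assumption that reads and writes happen in program order, which is why the sketch uses volatile accesses.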

Mutual Exclusion
• If thread 0 in critical section,
  – flag[0] = true,
  – victim = 1
• If thread 1 in critical section,
  – flag[1] = true,
  – victim = 0
• Cannot both be true

Deadlock Free
• Thread blocked
  – only at while loop
  – only if it is the victim (the other has the turn)
• victim holds a single value, so one or the other must have the turn

Lockout Free
• Thread i blocked only if j repeatedly re-enters so that flag[j] == true and victim == i
• When j re-enters
  – it sets victim to j.
  – So i gets in

The Filter Algorithm for n Threads
There are n-1 “waiting rooms” called levels
• At each level
  – At least one enters level
  – At least one blocked if many try
• Only one thread makes it through
[Figure: levels stacked between the non-critical section (ncs) and the critical section (cs)]

Filter
  class Filter implements Lock {
    int[] level = new int[n];    // level I want to enter
    int[] victim = new int[n];   // stop me before I advance again
    public void acquire(int i) {
      for (int L = 1; L < n; L++) {   // one level at a time
        level[i] = L;                  // announce intention to enter level L
        victim[L] = i;                 // give priority to anyone but me
        // wait as long as someone else is at same or higher level, and I’m designated victim
        while ((∃ k != i) level[k] >= L && victim[L] == i) {};   // busy wait
        // thread enters level L when it completes the loop
      }
    }
    public void release(int i) {
      level[i] = 0;
    }
  }
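The existential test in the wait loop is slide pseudocode. A minimal runnable sketch follows, assuming `AtomicIntegerArray` for the shared arrays (so accesses behave like volatile reads and writes) and a hypothetical helper `someoneElseAtOrAbove` that expands the "exists k != i" condition into a scan; the `run` harness is also an assumption.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.concurrent.atomic.AtomicLong;

class FilterLock {
    private final int n;
    private final AtomicIntegerArray level;   // level[i]: level thread i wants to enter
    private final AtomicIntegerArray victim;  // victim[L]: last thread to arrive at level L

    FilterLock(int n) {
        this.n = n;
        this.level = new AtomicIntegerArray(n);
        this.victim = new AtomicIntegerArray(n);
    }

    void acquire(int i) {
        for (int L = 1; L < n; L++) {          // one level at a time
            level.set(i, L);                   // announce intention to enter level L
            victim.set(L, i);                  // give priority to anyone but me
            while (someoneElseAtOrAbove(i, L) && victim.get(L) == i) {
                Thread.yield();                // busy wait
            }
        }
    }

    // Expands "(exists k != i) level[k] >= L" into an explicit scan.
    private boolean someoneElseAtOrAbove(int i, int L) {
        for (int k = 0; k < n; k++) {
            if (k != i && level.get(k) >= L) return true;
        }
        return false;
    }

    void release(int i) {
        level.set(i, 0);
    }

    // Hypothetical harness: each thread increments a shared counter perThread times.
    static long run(int threads, int perThread) {
        FilterLock lock = new FilterLock(threads);
        AtomicLong counter = new AtomicLong();
        Thread[] ts = new Thread[threads];
        for (int id = 0; id < threads; id++) {
            final int me = id;
            ts[id] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    lock.acquire(me);
                    counter.set(counter.get() + 1);  // non-atomic on purpose: relies on the lock
                    lock.release(me);
                }
            });
            ts[id].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return counter.get();
    }
}
```

Note that every pass through the wait loop reads up to n entries of `level`, a cost the later slides return to.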

Claim
• Start at level L=0
• At most n-L threads enter level L
• Mutual exclusion at level L=n-1
[Figure: levels L=0, L=1, …, L=n-2, L=n-1 between ncs and cs]

Induction Hypothesis
• No more than n-L+1 at level L-1
• Induction step: by contradiction
• Assume all at level L-1 enter level L
• A last to write victim[L]
• B is any other thread at level L

First Observation
(1) write_B(level[B]=L) → write_B(victim[L]=B)
Use the code, Luke!

Second Verse, Same as the First
(2) write_A(victim[L]=A) → read_A(level[B])

Third Observation
(3) write_B(victim[L]=B) → write_A(victim[L]=A)
By hypothesis, A is the last thread to write victim[L]

Combining Observations
(1) write_B(level[B]=L) → write_B(victim[L]=B)
(3) write_B(victim[L]=B) → write_A(victim[L]=A)
(2) write_A(victim[L]=A) → read_A(level[B])
So A read level[B] >= L while still the victim, and could not have entered level L: a contradiction

r-Bounded Waiting
• Divide lock.acquire() into 2 parts:
  – Doorway interval:
    • Written D_A
    • always finishes in finite steps
  – Waiting interval:
    • Written W_A
    • may take unbounded steps

r-Bounded Waiting
• For threads A and B:
  – If D_A^k → D_B^j
    • A’s k-th doorway precedes B’s j-th doorway
  – Then CS_A^k → CS_B^{j+r}
    • A’s k-th critical section precedes B’s (j+r)-th critical section
    • B cannot overtake A by more than r times
• First-come-first-served means r = 0.

Fairness Again
• Filter Lock satisfies properties:
  – No one starves (no lockout)
  – But no fairness guarantee: it is not r-bounded for any r

Bakery Algorithm
• Basic Idea
  – Take a “number”
  – Wait until lower numbers have been served
• Lexicographic order
  – (a, b) > (c, d)
    • If a > c, or a = c and b > d

Bakery Algorithm
  class Bakery implements Lock {
    boolean[] flag = new boolean[n];
    int[] label = new int[n];
    public void acquire(int i) {
      // Doorway interval
      flag[i] = true;                                  // I’m interested
      label[i] = max(label[0], …, label[n-1]) + 1;     // take an increasing label
      // Waiting interval: someone else is interested, with an earlier label
      while (∃ k flag[k] && (label[i], i) > (label[k], k)) {};
    }
    public void release(int i) {
      flag[i] = false;                                 // I’m no longer interested
    }
  }
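As with the Filter lock, the `max(…)` and "exists k" expressions are pseudocode. The sketch below is one possible runnable rendering under stated assumptions: `AtomicIntegerArray` and `AtomicLongArray` stand in for the shared arrays, labels are `long` to push overflow far away, and the helper `someoneAhead` and the `run` harness are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

class BakeryLock {
    private final int n;
    private final AtomicIntegerArray flag;   // 1 = interested
    private final AtomicLongArray label;

    BakeryLock(int n) {
        this.n = n;
        this.flag = new AtomicIntegerArray(n);
        this.label = new AtomicLongArray(n);
    }

    void acquire(int i) {
        // Doorway: announce interest, then take a label larger than any label seen.
        flag.set(i, 1);
        long max = 0;
        for (int k = 0; k < n; k++) max = Math.max(max, label.get(k));
        label.set(i, max + 1);
        // Waiting: spin while some interested thread has a lexicographically earlier (label, id).
        while (someoneAhead(i)) Thread.yield();
    }

    private boolean someoneAhead(int i) {
        long mine = label.get(i);
        for (int k = 0; k < n; k++) {
            if (k == i || flag.get(k) == 0) continue;
            long theirs = label.get(k);
            if (theirs < mine || (theirs == mine && k < i)) return true;
        }
        return false;
    }

    void release(int i) {
        flag.set(i, 0);
    }

    // Hypothetical harness: each thread increments a shared counter perThread times.
    static long run(int threads, int perThread) {
        BakeryLock lock = new BakeryLock(threads);
        AtomicLong counter = new AtomicLong();
        Thread[] ts = new Thread[threads];
        for (int id = 0; id < threads; id++) {
            final int me = id;
            ts[id] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    lock.acquire(me);
                    counter.set(counter.get() + 1);  // non-atomic on purpose: relies on the lock
                    lock.release(me);
                }
            });
            ts[id].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return counter.get();
    }
}
```

Two threads can pick the same label when their doorways overlap; the thread ID in the pair (label, id) breaks the tie.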

No Deadlock
• There is always one thread with earliest label
• Ties are impossible (why?)

First-Come-First-Served
• If D_A → D_B then A’s label is earlier
  – write_A(label[A]) → read_B(label[A]) → write_B(label[B]) → read_B(flag[A])
• So B is locked out while flag[A] is true

Mutual Exclusion
• Suppose A and B in CS together
• Suppose A has earlier label
• When B entered, it must have seen
  – flag[A] is false, or
  – label[A] > label[B]

Mutual Exclusion
• Labels are strictly increasing so
• B must have seen flag[A] == false
• Labeling_B → read_B(flag[A]) → write_A(flag[A]) → Labeling_A
• Which contradicts the assumption that A has an earlier label

Bakery Y2^32K Bug
  label[i] = max(label[0], …, label[n-1]) + 1;
• FCFS breaks if label[i] overflows

Does Overflow Actually Matter?
• Yes
  – Y2K
  – 19 January 2038 UTC (Unix time_t rollover)
  – 16-bit counters
• No
  – 64-bit counters
• Maybe
  – 32-bit counters
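To make the overflow concern concrete, the sketch below computes a Bakery-style "max plus one" label with 32-bit `int` arithmetic. The class and method names are hypothetical. Java integer addition wraps silently, so the "later" label comes out smaller than every existing one; `Math.addExact` is shown as one way to detect the wrap instead.

```java
class LabelOverflow {
    // Bakery-style label: one more than the largest label seen (wraps silently on overflow).
    static int nextLabel(int... labels) {
        int max = 0;
        for (int x : labels) max = Math.max(max, x);
        return max + 1;
    }

    // Same computation, but throws ArithmeticException instead of wrapping.
    static int nextLabelChecked(int... labels) {
        int max = 0;
        for (int x : labels) max = Math.max(max, x);
        return Math.addExact(max, 1);
    }
}
```

Once a label wraps to a negative value, a newly arriving thread appears to hold the earliest label and overtakes threads that were already waiting, which is how first-come-first-served breaks.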

Timestamps
• Label variable is really a timestamp
• Need ability to
  – Read others’ timestamps
  – Compare them
  – Generate a later timestamp
• Can we do this without overflow?

The Good News
• One can construct a
  – Wait-free
  – Concurrent (this part is hard)
  – Timestamping system
  – That never overflows

Instead …
• We will construct a
  – Sequential
  – Timestamping system
  – That never overflows
• Same basic idea
• Except simpler

Precedence Graphs
• Timestamps form directed graph
• Edge x to y
  – Means x is later timestamp
  – We say x dominates y
[Figure: nodes 0, 1, 2, 3 in a chain]

Unbounded Counter Precedence Graph
• Timestamp system as
  – Token-moving on graph
  – Ignore tie-breaking for now
[Figure: chain of nodes 0, 1, 2, 3, and so on …; successive tokens take 0, then 1, then 2]

Two-Thread Bounded Precedence Graph T^2
[Figure: nodes 0, 1, 2 arranged in a cycle; the two tokens keep moving around it, and so on …]

Three-Thread Bounded Precedence Graph?
[Figure: nodes 0, 1, 2, 3 arranged in a cycle]
• Not clear what to do if one thread gets stuck

Graph Composition
• T^3 = T^2 * T^2
• Replace each vertex with a copy of the graph
[Figure: each of the nodes 0, 1, 2 of T^2 replaced by a copy of T^2]

Three-Thread Bounded Precedence Graph T^3
[Figure: three copies of T^2, each with nodes 0, 1, 2, arranged as the nodes 0, 1, 2 of an outer T^2; example ordering 20 < 21 < 02]

In General
• T^k = T^2 * T^(k-1)
• K threads need 3^k nodes
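The recursive construction suggests a simple way to compare two bounded timestamps. The sketch below rests on two assumptions read off the "20 < 21 < 02" example rather than stated in the slides: in T^2 node b dominates node a when b = (a + 1) mod 3, and a label is written with the digit for the outermost copy first. The class and method names are hypothetical.

```java
class BoundedLabels {
    // Assumed orientation of the T^2 cycle: 1 dominates 0, 2 dominates 1, 0 dominates 2.
    static boolean dominatesT2(int b, int a) {
        return b == (a + 1) % 3;
    }

    // x dominates y in the composed graph: the first differing digit (outermost first) decides.
    static boolean dominates(int[] x, int[] y) {
        if (x.length != y.length) throw new IllegalArgumentException("labels must have equal length");
        for (int d = 0; d < x.length; d++) {
            if (x[d] != y[d]) return dominatesT2(x[d], y[d]);
        }
        return false;   // identical labels: neither dominates
    }
}
```

Because every digit is in {0, 1, 2}, labels have a fixed width and never overflow; the price is that dominance is a cycle, so it is only meaningful among the labels currently held by threads.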

Deep Philosophical Question
• The Bakery Algorithm is
  – Succinct,
  – Elegant, and
  – Fair.
• Q: So why isn’t it practical?
• A: You have to read N fields

So What?
• Can we avoid reading all N fields?
  – Maybe using MRMW registers?
• Sometimes we can:
  – In fast-path algorithms, number of reads and writes depends on number of actual contenders
• Except we can’t really:
  – Worst-case still requires N distinct fields
• Let’s prove it

Theorem
At least N multi-reader/single-writer registers are needed to solve deadlock-free mutual exclusion.

Proof
• Each thread must write to some register
• If A never writes, the others can’t tell whether A is in critical section
[Figure: threads A, B and C, with A in the CS and another thread entering the CS as well]

Upper Bound
• You need at least N MRSW registers
• Bakery algorithm
  – Uses 2N MRSW registers
• So the bound is (pretty) tight
• But what if we use MRMW registers?
  – Like the Filter algorithm?

Bad News Theorem
At least N multi-reader/multi-writer registers are needed to solve deadlock-free mutual exclusion.

Covering State
[Figure: threads A, B, C about to perform Write(RA), Write(RB), Write(RC)]
• All registers about to be written
• CS looks empty to all threads

Proof: Assume
• Only N-1 registers
[Figure: threads A, B, C; B and C about to perform Write(RB), Write(RC)]

Solo Execution
• A writes to all registers, enters CS

Covering State
• Other threads obliterate evidence that A entered CS

Mutual Exclusion Fails
• CS looks empty, so another thread gets in

Proof Strategy
• Need to show that a covering state is reachable from any state where CS is empty

Covering State for One Register
• B has to write to some register to enter CS, so stop it just before
[Figure: B about to perform Write(RB)]

Covering State
[Figure: A about to perform Write(RA), B about to perform Write(RB)]
• If we run B through CS 3 times, B must return twice to some register, say RB

Covering State
• Start with B covering register RB
• Run A until it is about to write to uncovered RA
• Are we done?

Covering State
• A could have written to RB
• CS no longer looks empty to some thread

Covering State
• Run B obliterating traces of A in register RB
• Run B again until it is about to write to RB
• Now we are done

Inductively We Can Show
• There is a covering state
  – Where k threads not in CS
  – Cover k distinct registers
  – k = N-1 delivers proof
[Figure: threads A, B, C about to perform Write(RA), Write(RB), Write(RC)]

Mutual Exclusion in Practice
• Shared FIFO queue
• Written in standard Java™

Lock-Based Queue
  public class Queue {
    int head = 0, tail = 0;
    Item[] items = new Item[QSIZE];
    // synchronized: acquire lock on entry, release on exit
    public synchronized void enq(Item x) throws InterruptedException {
      while (this.tail - this.head == QSIZE)
        this.wait();                           // wait until Queue has room
      this.items[this.tail++ % QSIZE] = x;     // add the item
      this.notifyAll();                        // wake up sleepers
    }
    …
  }
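A self-contained version of this monitor-style queue might look as follows. The slide elides `deq()`, so the one shown here is an assumption modeled on `enq()`; the generic class name `BoundedQueue` and the `transfer` harness are also hypothetical.

```java
class BoundedQueue<T> {
    private final Object[] items;
    private int head = 0, tail = 0;

    BoundedQueue(int capacity) {
        items = new Object[capacity];
    }

    public synchronized void enq(T x) throws InterruptedException {
        while (tail - head == items.length) wait();   // wait until the queue has room
        items[tail++ % items.length] = x;             // add the item
        notifyAll();                                  // wake up sleepers
    }

    // Assumed counterpart of enq(); not shown on the slide.
    @SuppressWarnings("unchecked")
    public synchronized T deq() throws InterruptedException {
        while (tail == head) wait();                  // wait until the queue is non-empty
        T x = (T) items[head++ % items.length];
        notifyAll();
        return x;
    }

    // Hypothetical harness: one producer sends 0..n-1, one consumer sums what it receives.
    static long transfer(int n, int capacity) {
        BoundedQueue<Integer> q = new BoundedQueue<>(capacity);
        long[] sum = {0};
        Thread producer = new Thread(() -> {
            try { for (int v = 0; v < n; v++) q.enq(v); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        Thread consumer = new Thread(() -> {
            try { for (int k = 0; k < n; k++) sum[0] += q.deq(); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        producer.start();
        consumer.start();
        try { producer.join(); consumer.join(); }
        catch (InterruptedException e) { throw new RuntimeException(e); }
        return sum[0];
    }
}
```

The `while` around `wait()` matters: a woken thread must recheck the condition, because another thread may have changed the queue between the `notifyAll()` and the wake-up.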

Observations
• Each method locks entire queue
• No concurrency between methods
• Is this really necessary? No
• And thereby hangs a tale …

Lock-Free Queue
• Imagine two threads
  – One enqueues only
  – One dequeues only
• Do they need mutual exclusion?

Lock-Free Queue
  public class LockFreeQueue {
    volatile int head = 0, tail = 0;   // volatile so each thread sees the other’s updates
    Item[] items = new Item[QSIZE];
    public void enq(Item x) {
      while (tail - head == QSIZE);   // busy-wait
      items[tail % QSIZE] = x;
      tail++;
    }
    public Item deq() {
      while (tail == head);           // busy-wait
      Item item = items[head % QSIZE];
      head++;
      return item;
    }
  }
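The two-thread queue can be exercised with a small harness. The sketch below assumes exactly one enqueuing thread and one dequeuing thread, and declares `head` and `tail` `volatile` so that each index update publishes the preceding array access; the names `SpscQueue` and `transfer` are hypothetical.

```java
class SpscQueue<T> {
    private final Object[] items;
    private volatile int head = 0;   // written only by the dequeuer
    private volatile int tail = 0;   // written only by the enqueuer

    SpscQueue(int capacity) {
        items = new Object[capacity];
    }

    void enq(T x) {
        while (tail - head == items.length) Thread.yield();  // busy-wait while full
        items[tail % items.length] = x;
        tail = tail + 1;             // single writer, so a plain read-then-write is safe
    }

    @SuppressWarnings("unchecked")
    T deq() {
        while (tail == head) Thread.yield();                  // busy-wait while empty
        T x = (T) items[head % items.length];
        head = head + 1;             // frees the slot for the enqueuer
        return x;
    }

    // Hypothetical harness: one producer sends 0..n-1, one consumer sums what it receives.
    static long transfer(int n, int capacity) {
        SpscQueue<Integer> q = new SpscQueue<>(capacity);
        long[] sum = {0};
        Thread producer = new Thread(() -> {
            for (int v = 0; v < n; v++) q.enq(v);
        });
        Thread consumer = new Thread(() -> {
            for (int k = 0; k < n; k++) sum[0] += q.deq();
        });
        producer.start();
        consumer.start();
        try { producer.join(); consumer.join(); }
        catch (InterruptedException e) { throw new RuntimeException(e); }
        return sum[0];
    }
}
```

This works without locks only because each index has a single writer; with two enqueuers, both could pass the full check and write the same slot.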

Vive La Différence
• The lock-based Queue
  – Is coarse-grained synchronization
  – Critical section is entire method
• The lock-free Queue
  – Is fine-grained synchronization
  – Critical section is single machine instruction

Critical Sections
• Easy way to implement concurrent objects
  – Take sequential object
  – Make each method a critical section
• Like synchronized methods in Java™
• Problems
  – Blocking
  – No concurrency

Amdahl’s Law
  Speedup = 1 / ((1 - p) + p/n)
• p: parallel fraction
• 1 - p: sequential fraction
• n: number of processors

Example
• Ten processors
• 60% concurrent, 40% sequential
• How close to 10-fold speedup?
  Speedup = 1 / (1 - 0.6 + 0.6/10) = 2.17

Example
• Ten processors
• 80% concurrent, 20% sequential
• How close to 10-fold speedup?
  Speedup = 1 / (1 - 0.8 + 0.8/10) = 3.57

Example
• Ten processors
• 90% concurrent, 10% sequential
• How close to 10-fold speedup?
  Speedup = 1 / (1 - 0.9 + 0.9/10) = 5.26

Example
• Ten processors
• 99% concurrent, 1% sequential
• How close to 10-fold speedup?
  Speedup = 1 / (1 - 0.99 + 0.99/10) = 9.17
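The four examples can be reproduced with a one-line function. In this sketch the names `Amdahl`, `speedup`, `p` (parallel fraction) and `n` (number of processors) are chosen for illustration.

```java
class Amdahl {
    // Speedup on n processors when a fraction p of the work can run in parallel.
    static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        double[] fractions = {0.6, 0.8, 0.9, 0.99};
        for (double p : fractions) {
            System.out.printf("p = %.2f, n = 10: speedup = %.2f%n", p, speedup(p, 10));
        }
    }
}
```

Even with 90% of the work parallel, ten processors deliver barely more than half of the ideal 10-fold speedup, because the sequential 10% dominates the running time.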

The Moral
• Granularity matters
  – Long critical sections vs atomic machine instructions
  – Smaller the granularity, greater the speedup