DataIntensive Computing Systems Concurrency Control II Shivnath Babu

Data-Intensive Computing Systems Concurrency Control (II) Shivnath Babu 1

How to enforce serializable schedules? Option 1: run system, recording P(S); at end of day, check for P(S) cycles and declare if execution was good 2

How to enforce serializable schedules? Option 2: prevent P(S) cycles from occurring T 1 T 2 …. . Tn Scheduler DB 3

A locking protocol Two new actions: lock (exclusive): unlock: T 1 li (A) ui (A) T 2 scheduler lock table 4

Rule #1: Well-formed transactions Ti: … li(A) … pi(A) … ui(A). . . 5

Rule #2 Legal scheduler S = ……. . li(A) ………. . . ui(A) ……. . . no lj(A) 6

Exercise: • What schedules are legal? What transactions are well-formed? S 1 = l 1(A)l 1(B)r 1(A)w 1(B)l 2(B)u 1(A)u 1(B) r 2(B)w 2(B)u 2(B)l 3(B)r 3(B)u 3(B) S 2 = l 1(A)r 1(A)w 1(B)u 1(A)u 1(B) l 2(B)r 2(B)w 2(B)l 3(B)r 3(B)u 3(B) S 3 = l 1(A)r 1(A)u 1(A)l 1(B)w 1(B)u 1(B) l 2(B)r 2(B)w 2(B)u 2(B)l 3(B)r 3(B)u 3(B) 7

Exercise: • What schedules are legal? What transactions are well-formed? S 1 = l 1(A)l 1(B)r 1(A)w 1(B)l 2(B)u 1(A)u 1(B) r 2(B)w 2(B)u 2(B)l 3(B)r 3(B)u 3(B) S 2 = l 1(A)r 1(A)w 1(B)u 1(A)u 1(B) l 2(B)r 2(B)w 2(B)l 3(B)r 3(B)u 3(B) S 3 = l 1(A)r 1(A)u 1(A)l 1(B)w 1(B)u 1(B) l 2(B)r 2(B)w 2(B)u 2(B)l 3(B)r 3(B)u 3(B) 8

Schedule F T 1 T 2 l 1(A); Read(A) A A+100; Write(A); u 1(A) l 2(A); Read(A) A Ax 2; Write(A); u 2(A) l 2(B); Read(B) B Bx 2; Write(B); u 2(B) l 1(B); Read(B) B B+100; Write(B); u 1(B) 9

Schedule F A B T 1 T 2 25 25 l 1(A); Read(A) A A+100; Write(A); u 1(A) 125 l 2(A); Read(A) A Ax 2; Write(A); u 2(A) 250 l 2(B); Read(B) B Bx 2; Write(B); u 2(B) 50 l 1(B); Read(B) B B+100; Write(B); u 1(B) 150 250 10

Rule #3 Two phase locking (2 PL) for transactions Ti = ……. li(A) ………. . . ui(A) ……. . . no unlocks no locks 11

# locks held by Ti Time Growing Phase Shrinking Phase 12

Schedule G T 1 l 1(A); Read(A) A A+100; Write(A) l 1(B); u 1(A) T 2 delayed l 2(A); Read(A) A Ax 2; Write(A); l 2(B) 13

Schedule G T 1 l 1(A); Read(A) A A+100; Write(A) l 1(B); u 1(A) T 2 delayed l 2(A); Read(A) A Ax 2; Write(A); l 2(B) Read(B); B B+100 Write(B); u 1(B) 14

Schedule G T 1 l 1(A); Read(A) A A+100; Write(A) l 1(B); u 1(A) T 2 delayed l 2(A); Read(A) A Ax 2; Write(A); l 2(B) Read(B); B B+100 Write(B); u 1(B) l 2(B); u 2(A); Read(B) B Bx 2; Write(B); u 2(B); 15

Schedule H (T 2 reversed) T 1 l 1(A); Read(A) A A+100; Write(A) l 1(B) delayed T 2 l 2(B); Read(B) B Bx 2; Write(B) l 2(A) delayed 16

• Assume deadlocked transactions are rolled back – They have no effect – They do not appear in schedule E. g. , Schedule H = This space intentionally left blank! 17

Next step: Show that rules #1, 2, 3 conflictserializable schedules 18

Conflict rules for li(A), ui(A): • li(A), lj(A) conflict • li(A), uj(A) conflict Note: no conflict < ui(A), uj(A)>, < li(A), rj(A)>, . . . 19

Theorem Rules #1, 2, 3 conflict (2 PL) serializable schedule To help in proof: Definition Shrink(Ti) = SH(Ti) = first unlock action of Ti 20

Lemma Ti Tj in S SH(Ti) <S SH(Tj) Proof of lemma: Ti Tj means that S = … pi(A) … qj(A) …; p, q conflict By rules 1, 2: S = … pi(A) … ui(A) … lj(A). . . qj(A) … By rule 3: SH(Ti) SH(Tj) So, SH(Ti) <S SH(Tj) 21

Theorem Rules #1, 2, 3 conflict (2 PL) serializable schedule Proof: (1) Assume P(S) has cycle T 1 T 2 …. Tn T 1 (2) By lemma: SH(T 1) < SH(T 2) <. . . < SH(T 1) (3) Impossible, so P(S) acyclic (4) S is conflict serializable 22

• Beyond this simple 2 PL protocol, it is all a matter of improving performance and allowing more concurrency…. – Shared locks – Multiple granularity – Inserts, deletes, and phantoms – Other types of C. C. mechanisms 23

Shared locks So far: S =. . . l 1(A) r 1(A) u 1(A) … l 2(A) r 2(A) u 2(A) … Do not conflict Instead: S=. . . ls 1(A) r 1(A) ls 2(A) r 2(A) …. us 1(A) us 2(A) 24

Lock actions l-ti(A): lock A in t mode (t is S or X) u-ti(A): unlock t mode (t is S or X) Shorthand: ui(A): unlock whatever modes Ti has locked A 25

Rule #1 Well formed transactions Ti =. . . l-S 1(A) … r 1(A) … u 1 (A) … Ti =. . . l-X 1(A) … w 1(A) … u 1 (A) … 26

• What about transactions that read and write same object? Option 1: Request exclusive lock Ti =. . . l-X 1(A) … r 1(A). . . w 1(A). . . u(A) … 27

• What about transactions that read and write same object? Option 2: Upgrade (E. g. , need to read, but don’t know if will write…) Ti=. . . l-S 1(A) … r 1(A). . . l-X 1(A) …w 1(A). . . u(A)… Think of - Get 2 nd lock on A, or - Drop S, get X lock 28

Rule #2 Legal scheduler S =. . l-Si(A) … … ui(A) … no l-Xj(A) S =. . . l-Xi(A) … … ui(A) … no l-Xj(A) no l-Sj(A) 29

A way to summarize Rule #2 Compatibility matrix Comp S X S true false X false 30

Rule # 3 2 PL transactions No change except for upgrades: (I) If upgrade gets more locks (e. g. , S {S, X}) then no change! (II) If upgrade goes to more dominant lock atomically (e. g. , S X) - has to be atomic 31

Theorem Rules 1, 2, 3 Conf. serializable for S/X locks schedules Proof: similar to X locks case 32

Lock types beyond S/X Examples: (1) increment lock (2) update lock 33

Example (1): increment lock • Atomic increment action: INi(A) {Read(A); A A+k; Write(A)} • INi(A), INj(A) do not conflict! INj(A) INi(A) A=7 +10 +2 A=5 A=17 +10 +2 A=15 INj(A) INi(A) 34

Comp S X I 35

Comp S X I S T F F X F F F I F F T 36

Update locks A common deadlock problem with upgrades: T 1 T 2 l-S 1(A) l-S 2(A) l-X 1(A) l-X 2(A) --- Deadlock --37

Solution If Ti wants to read A and knows it may later want to write A, it requests update lock (not shared) 38

New request Comp Lock already held in S X U 39

New request Comp Lock already held in S T F S X U Tor. F X F F F U T F F -> symmetric table? 40

Note: object A may be locked in different modes at the same time. . . S 1=. . . l-S 1(A)…l-S 2(A)…l-U 3(A)… l-S 4(A)…? l-U 4(A)…? • To grant a lock in mode t, mode t must be compatible with all currently held locks on object 41

How does locking work in practice? • Every system is different (E. g. , may not even provide CONFLICT-SERIALIZABLE schedules) • But here is one (simplified) way. . . 42

Sample Locking System: (1) Don’t trust transactions to request/release locks (2) Hold all locks until transaction commits # locks time 43

Ti Read(A), Write(B) lock table Scheduler, part I l(A), Read(A), l(B), Write(B)… Scheduler, part II Read(A), Write(B) DB 44

A B C Conceptually If null, object is unlocked Lock info for B Lock info for C . . . Every possible object Lock table 45

But use hash table: H . . . A A Lock info for A . . . If object not found in hash table, it is unlocked 46

Lock info for A - example tran mode wait? Nxt T_link Object: A Group mode: U Waiting: yes List: T 1 S no T 2 U no T 3 X yes To other T 3 records 47

What are the objects we lock? Relation A Relation B Tuple A Tuple B Tuple C Disk block A . . . Disk block B ? . . . DB DB DB 48

• Locking works in any case, but should we choose small or large objects? • If we lock large objects (e. g. , Relations) – Need few locks – Low concurrency • If we lock small objects (e. g. , tuples, fields) – Need more locks – More concurrency 49

We can have it both ways!! Ask any janitor to give you the solution. . . Stall 1 Stall 2 Stall 3 Stall 4 restroom hall 50

Example T 1(IS) , T 2(S) R 1 t 2 t 3 t 4 T 1(S) 51

Example T 1(IS) , T 2(IX) R 1 t 2 T 1(S) t 3 t 4 T 2(IX) 52

Multiple granularity Comp Requestor IS IX S SIX X IS Holder IX S SIX X 53

Multiple granularity Comp Requestor IS IX S SIX X IS T Holder IX T SIX T X F T T F F F T F F F F F 54

Rules (1) Follow multiple granularity comp function (2) Lock root of tree first, any mode (3) Node Q can be locked by Ti in S or IS only if parent(Q) locked by Ti in IX or IS (4) Node Q can be locked by Ti in X, SIX, IX only if parent(Q) locked by Ti in IX, SIX (5) Ti is two-phase (6) Ti can unlock node Q only if none of Q’s children are locked by Ti 55

By same transaction Parent locked in IS IX S SIX X Child can be locked in P C 56

By same transaction Parent locked in IS IX S SIX X Child can be locked in IS, S, IX, X, SIX [S, IS] not necessary X, IX, [SIX] none P C 57

Exercise: • Can T 2 access object f 2. 2 in X mode? What locks will T 2 get? T 1(IX) t 1 R 1 T 1(IX) t 2 T 1(X) f 2. 1 f 2. 2 t 4 t 3 f 3. 1 f 3. 2 58

Exercise: • Can T 2 access object f 2. 2 in X mode? What locks will T 2 get? T 1(IX) t 1 T 1(X) t 2 f 2. 1 f 2. 2 R 1 t 4 t 3 f 3. 1 f 3. 2 59

Exercise: • Can T 2 access object f 3. 1 in X mode? What locks will T 2 get? T 1(IS) t 1 T 1(S) t 2 f 2. 1 f 2. 2 R 1 t 4 t 3 f 3. 1 f 3. 2 60

Exercise: • Can T 2 access object f 2. 2 in S mode? What locks will T 2 get? T 1(SIX) t 1 R 1 T 1(IX) t 2 T 1(X) f 2. 1 f 2. 2 t 4 t 3 f 3. 1 f 3. 2 61

Exercise: • Can T 2 access object f 2. 2 in X mode? What locks will T 2 get? T 1(SIX) t 1 R 1 T 1(IX) t 2 T 1(X) f 2. 1 f 2. 2 t 4 t 3 f 3. 1 f 3. 2 62

Insert + delete operations A. . . Z a Insert 63

Modifications to locking rules: (1) Get exclusive lock on A before deleting A (2) At insert A operation by Ti, Ti is given exclusive lock on A Are these enough? 64

Still have a problem: Phantoms Example: relation R (E#, name, …) constraint: E# is key use tuple locking R o 1 o 2 E# Name 55 Smith 75 Jones …. 65

T 1: Insert <04, Kerry, …> into R T 2: Insert <04, Bush, …> into R T 1 S 1(o 1) S 1(o 2) Check Constraint S 2(o 1) S 2(o 2) Check Constraint. . . Insert o 3[04, Kerry, . . ] T 2 Insert o 4[04, Bush, . . ] 66

Solution • Use multiple granularity tree • Before insert of node Q, lock parent(Q) in R 1 X mode t 1 t 2 t 3 67

Back to example T 1: Insert<04, Kerry> T 1 X 1(R) T 2: Insert<04, Bush> T 2 X 2(R) delayed Check constraint Insert<04, Kerry> U(R) X 2(R) Check constraint Oops! e# = 04 already in R! 68

Instead of using R, can use index on R: Example: R E#=2 E#=5 . . . Index 100<E#<200 Index 0<E#<100 . . . E#=107 E#=109 . . . 69

• This approach can be generalized to multiple indexes. . . 70

Next: • Tree-based concurrency control • Validation concurrency control 71

Example • all objects accessed through root, following pointers A B T 1 lock C T 1 lock D E T 1 lock F can we release A lock if we no longer need A? ? 72

Idea: traverse like “Monkey Bars” A T 1 lock B T 1 lock C T 1 lock D E T 1 lock F 73

Let us see why this works? • Assume all Ti start at root; exclusive lock • Ti Tj Ti locks root before Tj Root Q T i Tj • Actually works if we don’t always start at root 74

Rules: tree protocol (exclusive locks) (1) First lock by Ti may be on any item (2) After that, item Q can be locked by Ti only if parent(Q) locked by Ti (3) Items may be unlocked at any time (4) After Ti unlocks Q, it cannot relock Q 75

• Tree-like protocols are used typically for B-tree concurrency control Root E. g. , during insert, do not release parent lock, until you are certain child does not have to split 76

Validation Transactions have 3 phases: (1) Read – all DB values read – writes to temporary storage – no locking (2) Validate – check if schedule so far is serializable (3) Write – if validate ok, write to DB 77

Key idea • Make validation atomic • If T 1, T 2, T 3, … is validation order, then resulting schedule will be conflict equivalent to Ss = T 1 T 2 T 3. . . 78

To implement validation, system keeps two sets: • FIN = transactions that have finished phase 3 (and are all done) • VAL = transactions that have successfully finished phase 2 (validation) 79

Example of what validation must prevent: RS(T 2)={B} RS(T 3)={A, B} = WS(T 2)={B, D} WS(T 3)={C} T 2 start T 3 start T 2 T 3 validated time 80

allow Example of what validation must prevent: RS(T 2)={B} RS(T 3)={A, B} = WS(T 2)={B, D} WS(T 3)={C} T 2 start T 3 start T 2 T 3 validated T 2 finish phase 3 T 3 start time 81

Another thing validation must prevent: RS(T 2)={A} WS(T 2)={D, E} T 2 validated RS(T 3)={A, B} WS(T 3)={C, D} T 3 validated finish BAD: w 3(D) w 2(D) T 2 time 82

allow Another thing validation must prevent: RS(T 2)={A} WS(T 2)={D, E} T 2 RS(T 3)={A, B} WS(T 3)={C, D} T 3 validated finish T 2 time 83

Validation rules for Tj: (1) When Tj starts phase 1: ignore(Tj) FIN (2) at Tj Validation: if check (Tj) then [ VAL U {Tj}; do write phase; FIN U {Tj} ] 84

Check (Tj): For Ti VAL - IGNORE (Tj) DO IF [ WS(Ti) RS(Tj) OR Ti FIN ] THEN RETURN false; RETURN true; Is this check too restrictive ? 85

Improving Check(Tj) For Ti VAL - IGNORE (Tj) DO IF [ WS(Ti) (Ti RS(Tj) OR FIN AND WS(Ti) WS(Tj) )] THEN RETURN false; RETURN true; 86

Exercise: U: RS(U)={B} WS(U)={D} T: RS(T)={A, B} WS(T)={A, C} start validate finish W: RS(W)={A, D} WS(W)={A, C} V: RS(V)={B} WS(V)={D, E} 87

Validation (also called optimistic concurrency control) is useful in some cases: - Conflicts rare - System resources plentiful - Have real time constraints 88

Summary Have studied C. C. mechanisms used in practice - 2 PL - Multiple granularity - Tree (index) protocols - Validation 89