EECS 262 a Advanced Topics in Computer Systems
- Slides: 58
EECS 262 a Advanced Topics in Computer Systems Lecture 8 Transactions and Isolation Levels September 18 th, 2018 John Kubiatowicz Based on slides by Ali Ghodsi and Ion Stoica http: //www. eecs. berkeley. edu/~kubitron/cs 262
Today’s Papers • Granularity of Locks and Degrees of Consistency in a Shared Database (2 -up version) J. N. Gray, R. A. Lorie, G. R. Putzolu, I. L. Traiger. Appears In IFIP Working Conference on Modeling of Data Base Management Systems. 1975 • Generalized Isolation Level Definitions, A. Adya, B. Liskov, and P. O'Neil, 2000 • Thoughts? 9/18/2018 cs 262 a-F 18 Lecture-08 2
The ACID properties of Transactions • Atomicity: all actions in the transaction happen, or none happen • Consistency: if each transaction is consistent, and the database starts consistent, it ends up consistent, e. g. , – Balance cannot be negative – Cannot reschedule meeting on February 30 • Isolation: execution of one transaction is isolated from others • Durability: if a transaction commits, its effects persist 9/18/2018 cs 262 a-F 18 Lecture-08 3
Example: Transaction 101 BEGIN; --BEGIN TRANSACTION UPDATE accounts SET balance = balance - 100. 00 WHERE name = 'Alice'; UPDATE branches SET balance = balance - 100. 00 WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Alice'); UPDATE accounts SET balance = balance + 100. 00 WHERE name = 'Bob'; UPDATE branches SET balance = balance + 100. 00 WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Bob'); COMMIT; --COMMIT WORK Transfer $100 from Alice’s account to Bob’s account 9/18/2018 cs 262 a-F 18 Lecture-08 4
Why is it Hard? • Failures: might leave state inconsistent or cause updates to be lost – Remember last lecture? • Concurrency: might leave state inconsistent or cause updates to be lost – This lecture and the next one! 9/18/2018 cs 262 a-F 18 Lecture-08 5
Concurrency • When operations of concurrent threads are interleaved, the effect on shared state can be unexpected • Well known issue in operating systems, thread programming – Critical section in OSes – Java use of synchronized keyword 9/18/2018 cs 262 a-F 18 Lecture-08 6
Transaction Scheduling • Why not run only one transaction at a time? • Answer: low system utilization – Two transactions cannot run simultaneously even if they access different data • Goal of transaction scheduling: – Maximize system utilization, i. e. , concurrency » Interleave operations from different transactions – Preserve transaction semantics » Logically all operations in a transaction are executed atomically » Intermediate state of a transaction is not visible to other transactions 9/18/2018 cs 262 a-F 18 Lecture-08 7
Anomalies with Interleaved Execution • May violate transaction semantics, e. g. , some data read by the transaction changes before committing • Inconsistent database state, e. g. , some updates are lost • Anomalies always involves a “write”; Why? 9/18/2018 cs 262 a-F 18 Lecture-08 8
P 0 – Overwriting uncommitted data • Write-write conflict – T 2 writes value modified by T 1 before T 1 commits, e. g, T 2 overwrites W(A) before T 1 commits T 1: W(A), W(B) T 2: W(A), W(B) • Violates transaction serializability • If transactions were serial, you’d get either: – T 1’s updates of A and B – T 2’s updates of A and B 9/18/2018 cs 262 a-F 18 Lecture-08 9
P 1 – Reading uncommitted data (dirty read) • Write-read conflict (reading uncommitted data or dirty read) – T 2 reads value modified by T 1 before T 1 commits, e. g. , T 2 reads A before T 1 modifies it T 1: R(A), W(A), T 2: R(A), 9/18/2018 … cs 262 a-F 18 Lecture-08 10
P 3 – Non-repeatable reads • Read-Write conflict – T 2 reads value, after which T 1 modifies it, e. g. , T 2 reads A, after which T 1 modifies it T 1: R(A), W(A) T 2: R(A), W(A) • Example: Mary and John want to buy a TV set on Amazon but there is only one left in stock – (T 1) John logs first, but waits… – (T 2) Mary logs second and buys the TV set right away – (T 1) John decides to buy, but it is too late… 9/18/2018 cs 262 a-F 18 Lecture-08 11
Goals of Transaction Scheduling • Maximize system utilization, i. e. , concurrency – Interleave operations from different transactions • Preserve transaction semantics – Semantically equivalent to a serial schedule, i. e. , one transaction runs at a time T 1: R, W, R, W Serial schedule (T 1, then T 2): R, W, R, R, W 9/18/2018 T 2: R, W, R, R, W Serial schedule (T 2, then T 1): R, W, R, W cs 262 a-F 18 Lecture-08 12
Two Key Questions 1) Is a given schedule equivalent to a serial execution of transactions? Schedule: R, R, W, W, R, R, R, W, W Serial schedule (T 1, then T 2): R, W, R, R, W : Serial schedule (T 2, then T 1): R, W, R, W 2) How do you come up with a schedule equivalent to a serial schedule? 9/18/2018 cs 262 a-F 18 Lecture-08 13
Transaction Scheduling • Serial schedule: – A schedule that does not interleave the operations of different transactions – Transactions run serially (one at a time) • Equivalent schedules: – For any storage/database state, the effect (on storage/database) and output of executing the first schedule is identical to the effect of executing the second schedule • Serializable schedule: – A schedule that is equivalent to some serial execution of the transactions – Intuitively: with a serializable schedule you only see things that could happen in situations where you were running transactions one-at-a-time 9/18/2018 cs 262 a-F 18 Lecture-08 14
Conflict Serializable Schedules • Two operations conflict if they – Belong to different transactions – Are on the same data – At least one of them is a write • Two schedules are conflict equivalent iff: – Involve same operations of same transactions – Every pair of conflicting operations is ordered the same way • Schedule S is conflict serializable if S is conflict equivalent to some serial schedule 9/18/2018 cs 262 a-F 18 Lecture-08 15
Conflict Equivalence – Intuition • If you can transform an interleaved schedule by swapping consecutive non-conflicting operations of different transactions into a serial schedule, then the original schedule is conflict serializable, e. g. , T 1: R(A), W(A), R(B), W(B) T 2: R(A), W(A), R(B), W(B) 9/18/2018 cs 262 a-F 18 Lecture-08 16
Conflict Equivalence – Intuition • If you can transform an interleaved schedule by swapping consecutive non-conflicting operations of different transactions into a serial schedule, then the original schedule is conflict serializable, e. g. , T 1: R(A), W(A), R(B), W(B) T 2: R(A), W(A), R(B), W(B) 9/18/2018 cs 262 a-F 18 Lecture-08 17
Conflict Equivalence – Intuition • If you can transform an interleaved schedule by swapping consecutive non-conflicting operations of different transactions into a serial schedule, then the original schedule is conflict serializable, e. g. , T 1: R(A), W(A) T 2: R(A), W(A), • Is this schedule serializable? 9/18/2018 cs 262 a-F 18 Lecture-08 18
Dependency Graph • Dependency graph: – Transactions represented as nodes – Edge from Ti to Tj: » an operation of Ti conflicts with an operation of Tj » Ti appears earlier than Tj in the schedule • Theorem: Schedule is conflict serializable if and only if its dependency graph is acyclic 9/18/2018 cs 262 a-F 18 Lecture-08 19
Example • Conflict serializable schedule: T 1: R(A), W(A), R(B), W(B) T 2: R(A), W(A), R(B), W(B) T 1 A B T 2 Dependency graph • No cycle! 9/18/2018 cs 262 a-F 18 Lecture-08 20
Example • Conflict that is not serializable: T 1: R(A), W(A), R(B), W(B) T 2: R(A), W(A), R(B), W(B) A T 1 B T 2 Dependency graph • Cycle: The output of T 1 depends on T 2, and vice-versa 9/18/2018 cs 262 a-F 18 Lecture-08 21
Notes on Conflict Serializability • Conflict Serializability doesn’t allow all schedules that you would consider correct – This is because it is strictly syntactic - it doesn’t consider the meanings of the operations or the data • Many times, Conflict Serializability is what gets used, because it can be done efficiently – See isolation degrees/levels next • Two-phase locking (2 PL) is how we implement it 9/18/2018 cs 262 a-F 18 Lecture-08 22
Serializability ≠ Conflict Serializability • Following schedule is not conflict serializable T 1: R(A), W(A), T 2: W(A), T 3: WA Dependency graph A A T 1 T 2 A A T 3 • However, the schedule is serializable since its output is equivalent with the following serial schedule T 1: R(A), W(A), T 2: W(A), T 3: WA • Note: deciding whether a schedule is serializable (not conflict-serializable) is NP-complete 9/18/2018 cs 262 a-F 18 Lecture-08 23
Locks • “Locks” to control access to data • Two types of locks: – shared (S) lock: multiple concurrent transactions allowed to operate on data – exclusive (X) lock: only one transaction can operate on data at a time HeldRequest S S Yes X Block 9/18/2018 X Block cs 262 a-F 18 Lecture-08 Lock Compatibility Matrix 24
Two-Phase Locking (2 PL) • Each transaction must obtain: S (shared) or X (exclusive) lock on data before reading – X (exclusive) lock on data before writing – • A transaction can not request additional locks once it releases any locks Thus, each transaction has a “growing phase” followed by a “shrinking phase” # Locks Held – Lock Point! 4 3 2 1 0 Growing Phase 1 9/18/2018 Shrinking 3 Phase 5 7 cs 262 a-F 18 Lecture-08 9 11 13 15 17 19 Time 25
Two-Phase Locking (2 PL) • 2 PL guarantees conflict serializability – Doesn’t allow dependency cycles. Why? • Answer: a dependency cycle leads to deadlock – Assume there is a cycle between Ti and Tj – Edge from Ti to Tj: Ti acquires lock first and Tj needs to wait – Edge from Tj to Ti: Tj acquires lock first and Ti needs to wait – Thus, both Ti and Tj wait for each other – Since with 2 PL neither Ti nor Tj release locks before acquiring all locks they need deadlock • Schedule of conflicting transactions is conflict equivalent to a serial schedule ordered by “lock point” 9/18/2018 cs 262 a-F 18 Lecture-08 26
Example • T 1 transfers $50 from account A to account B T 1: Read(A), A: =A-50, Write(A), Read(B), B: =B+50, Write(B) • T 2 outputs the total of accounts A and B T 2: Read(A), Read(B), PRINT(A+B) • Initially, A = $1000 and B = $2000 • What are the possible output values? 9/18/2018 cs 262 a-F 18 Lecture-08 27
Is this a 2 PL Schedule? 1 Lock_X(A) <granted> 2 Read(A) 3 A: = A-50 4 Write(A) 5 Unlock(A) Lock_S(A) <granted> 6 Read(A) 7 Unlock(A) 8 Lock_S(B) <granted> 9 Lock_X(B) 10 11 Read(B) <granted> 12 Unlock(B) PRINT(A+B) 13 Read(B) 14 B : = B +50 15 Write(B) 16 Unlock(B) No, and it is not serializable 9/18/2018 cs 262 a-F 18 Lecture-08 28
Is this a 2 PL Schedule? 1 Lock_X(A) <granted> 2 Read(A) 3 A: = A-50 4 Write(A) 5 Lock_X(B) <granted> 6 Unlock(A) Lock_S(A) <granted> 7 Read(A) 8 Lock_S(B) 9 Read(B) 10 B : = B +50 11 Write(B) 12 Unlock(B) <granted> 13 Unlock(A) 14 Read(B) 15 Unlock(B) 16 PRINT(A+B) Yes, it is serializable 9/18/2018 cs 262 a-F 18 Lecture-08 29
Strict 2 PL (cont’d) • All locks held by a transaction are released only when the transaction completes – In effect, “shrinking phase” is delayed until: a) Transaction has committed (commit log record on disk), or b) Decision has been made to abort the transaction (then locks can be released after rollback). 9/18/2018 cs 262 a-F 18 Lecture-08 30
Is this a Strict 2 PL schedule? 1 Lock_X(A) <granted> 2 Read(A) 3 A: = A-50 4 Write(A) 5 Lock_X(B) <granted> 6 Unlock(A) Lock_S(A) <granted> 7 Read(A) 8 Lock_S(B) 9 Read(B) 10 B : = B +50 11 Write(B) 12 Unlock(B) <granted> 13 Unlock(A) 14 Read(B) 15 Unlock(B) 16 PRINT(A+B) No: Cascading Abort Possible 9/18/2018 cs 262 a-F 18 Lecture-08 31
Granularity • What is a data item (on which a lock is obtained)? – In most modern systems: item is one tuple in a table – Sometimes (especially in early 1970 s): item is a page (with several tuples) – Sometimes: item is a whole table 9/18/2018 cs 262 a-F 18 Lecture-08 32
Granularity trade-offs • Larger granularity: fewer locks held, so less overhead; but less concurrency possible – “false conflicts” when txns deal with different parts of the same item • Smaller “fine” granularity: more locks held, so more overhead; but more concurrency is possible • System usually gets fine grain locks until there are too many of them; then it replaces them with larger granularity locks 9/18/2018 cs 262 a-F 18 Lecture-08 33
Multigranular locking • Care needed to manage conflicts properly among items of varying granularity – Note: conflicts only detectable among locks on a given item name • System gets “intention” mode locks on larger granules before getting actual S/X locks on smaller granules – Conflict rules arranged so that activities that do not commute must get conflicting locks on some item 9/18/2018 cs 262 a-F 18 Lecture-08 34
Lock Mode Conflicts HeldRequest IS IX S SIX X 9/18/2018 IS Yes Yes Block IX Yes Block Block SIX Yes Block cs 262 a-F 18 Lecture-08 X Block Block 35
Problems with serializability • The performance reduction from isolation is high – Transactions are often blocked because they want to read data that another transactions has changed • For many applications, the accuracy of the data they read is not crucial – e. g. overbooking a plane is ok in practice – e. g. your banking decisions would not be very different if you saw yesterday’s balance instead of the most up-todate 9/18/2018 cs 262 a-F 18 Lecture-08 37
Explicit isolation levels • A transaction can be declared to have isolation properties that are less stringent than serializability – However SQL standard says that default should be serializable (Gray’ 75 called this “level 3 isolation”) – In practice, most systems have weaker default level, and most transactions run at weaker levels! • Isolation levels are defined with respect to data access conflicts (phenomena) they preclude 9/18/2018 cs 262 a-F 18 Lecture-08 38
Phenomena • P 0: T 2 writes value modified by T 1 before T 1 commits – Transactions cannot be serialized by their writes • P 1: Dirty Read: T 2 reads value modified by T 1 before T 1 commits – If T 1 aborts it will be as if transaction T 2 read values that have never existed • P 2: Non-Repeatable Read: T 2 reads value, then T 1 modifies it – If T 2 attempts to re-read value it can read another value • P 3: Phantom: (see next) 9/18/2018 cs 262 a-F 18 Lecture-08 39
Phantom 1. A transaction T 1 reads a set of rows that satisfy some condition 2. Another transaction T 2 executes a statement that causes new rows to be added or removed from the search condition 3. If T 1 repeats the read it will obtain a different set of rows. 9/18/2018 cs 262 a-F 18 Lecture-08 40
Phantom Example T 1 T 2 Select count(*) where dept = “Acct” // find and S-lock (“Sue”, “Acct”, 3500) and (“Tim”, “Acct, 2400) Insert (“Joe”, ”Acct”, 2000) // X-lock the new record Commit // release locks Select sum(salary) where dept = “Acct” // find and S-lock (“Sue”, “Acct”, 3500) and (“Tim”, “Acct, 2400) and (“Joe”, “Acct”, 2000) 9/18/2018 cs 262 a-F 18 Lecture-08 41
Isolation Levels Isolation levels Degree Proscribed Phenomena Read locks on data items and phantoms (same unless noted) Write locks on data items and phantoms (always the same) 0 none Short write locks READ UNCOMMITTED 1 P 0 none Long write locks READ COMITTED 2 P 0, P 1 Short read locks Long write locks P 0, P 1, P 2 Long data-item read locks, short phantom locks Long write locks P 0, P 1, P 2, P 3 Long read locks Long write locks REAPEATABLE READ SERIALIZABLE ANSI 9/18/2018 3 Gray’s isolation degrees cs 262 a-F 18 Lecture-08 42
Generalized Isolation Levels 9/18/2018 cs 262 a-F 18 Lecture-08 43
Direct Serialization Graph (DSG) Conflict Name Description DSG Directly write-depends T 1 writes value, then T 2 overwrites it T 1 ww T 2 Directly read-depends T 1 writes value, then T 2 reads it T 1 wr T 2 Directly anti-depends T 1 reads value, then T 2 writes it T 1 rw T 2 Example: T 1: W(A), W(B), W(C) T 2: R(B), W(C) T 3: W(B) R(C), W(B) wr T 1 T 2 ww 9/18/2018 rw wr T 3 ww cs 262 a-F 18 Lecture-08 44
Disallowing P 0 • Writes by T 1 are not overwritten by T 2 while T 1 is uncommitted – Simplifies recovery from aborts, e. g. , » T 1 updates x, T 2 overwrites x , and then T 1 aborts » The system must not restore x to T 1’s pre-state » However, if T 2 aborts later, x must be restored to T 1’s prestate! – Serializes transactions based on their writes alone » all writes of T 2 must be ordered before or after all writes of T 1 G 0 only addresses this one 9/18/2018 cs 262 a-F 18 Lecture-08 45
G 0 • G 0: DSG contains a directed cycle consisting entirely of write-dependency edges – Just ensure serialization on writes alone – More permissive than Degree 1 as allows concurrent transactions to modify same object • Example: T 1: W(A) W(B), … T 2: W(A), W(B), … ww T 1 9/18/2018 ww cs 262 a-F 18 Lecture-08 T 2 46
Disallowing P 1 • Writes of T 1 could not be read by T 2 while T 1 is still uncommitted – It prevents a transaction T 2 from committing if T 2 has read the updates of a transaction that might later abort – It prevents transactions from reading intermediate modifications of other transactions – It serializes committed transactions based on their read/write-dependencies (but not their antidependencies), i. e. , » If transaction T 2 depends on T 1, T 1 cannot depend on T 2 9/18/2018 cs 262 a-F 18 Lecture-08 47
G 1 • G 1 a – Aborted reads: T 2 has read a value written by an aborted transaction T 1 • G 1 b – Intermediate Reads: Committed transaction T 2 has read an intermediate value written by transaction T 1 • G 1 c – Circular Information Flow: DSG contains a directed • cycle consisting entirely of dependency edges – Disallowing G 1 c ensures that if transaction T 2 is affected by transaction T 1, T 2 does not affect T 1 9/18/2018 cs 262 a-F 18 Lecture-08 48
Disallowing P 2 • T 1 cannot modify value read by T 2 – Precludes a transaction reading inconsistent data and making inconsistent updates 9/18/2018 cs 262 a-F 18 Lecture-08 49
G 2 • Just prevent transactions that perform inconsistent reads or writes from committing • G 2 – Anti-dependency Cycles: DSG contains a directed cycle with one or more anti-dependency edges • G 2 -item – Item Anti-dependency Cycles: DSG contains a directed cycle having one or more item-antidependency edges 9/18/2018 cs 262 a-F 18 Lecture-08 50
Generalized Isolation Levels Isolation levels G 0 G 1 G 2 -Item G 2 READ UNCOMMITTED NA NA READ COMITTED Not possible Possible REAPEATABLE READ Not possible Possible SERIALIZABLE Not possible 9/18/2018 cs 262 a-F 18 Lecture-08 51
Summary • Transactions, key abstractions on databases – Application defined sequence of operations on one or more databases that is atomic • Key challenge: trade performance to correctness – On one hand we want to interleave transactions to increase throughput – On the other hand we want to isolate transactions from each other • Solution: increase interleaving by providing – Multi-granularity locks – Relax the isolation semantics 9/18/2018 cs 262 a-F 18 Lecture-08 52
Were these good papers? • What were the authors’ goals? • What about the evaluation / metrics? • Did they convince you that this was a good system /approach? • Were there any red-flags? • What mistakes did they make? • Does the system/approach meet the “Test of Time” challenge? • How would you review this paper today? 9/18/2018 cs 262 a-F 18 Lecture-08 53
Extra slides… 9/18/2018 cs 262 a-F 18 Lecture-08 54
Snapshot Isolation (SI) • A multiversion concurrency control mechanism was described in SIGMOD ’ 95 by H. Berenson, P. Bernstein, J. Gray, J. Melton, E. O’Neil, P. O’Neil – Does not guarantee serializable execution! • Supplied by Oracle DB for “Isolation Level Serializable” (also in Postgre. SQL before rel 9. 1) • Available in Microsoft SQL Server 2005 as “Isolation Level Snapshot”, and in Postgre. SQL (since rel 9. 1) as “Isolation Level Repeatable Read” 9/18/2018 cs 262 a-F 18 Lecture-08 55
Snapshot Isolation (SI) • Read of an item may not give current value • Instead, use old versions (kept with timestamps) to find value that had been most recently committed at the time the txn started – Exception: if the txn has modified the item, use the value it wrote itself • The transaction sees a “snapshot” of the database, at an earlier time – Intuition: this should be consistent, if the database was consistent before 9/18/2018 cs 262 a-F 18 Lecture-08 56
First committer wins (FCW) • T will not be allowed to commit a modification to an item if any other transaction has committed a changed value for that item since T’s start (snapshot) • T must hold write locks on modified items at time of commit, to install them. – In practice, commit-duration write locks may be set when writes execute. – These simplify detection of conflicting modifications when T tries to write the item, instead of waiting till T tries to commit. 9/18/2018 cs 262 a-F 18 Lecture-08 57
Benefits of SI • Reading is never blocked, and reads don’t block writes • Avoids common anomalies – No dirty read – No lost update – No inconsistent read – Set-based selects are repeatable (no phantoms) • Matches common understanding of isolation: concurrent transactions are not aware of one another’s changes 9/18/2018 cs 262 a-F 18 Lecture-08 58
Is every execution serializable? • For any set of txns, if they all run with Two Phase Locking, then every interleaved execution is serializable • For some sets of txns, if they all run with SI, then every execution is serializable – Eg the txns making up TPC-C • For some sets of txns, if they all run with SI, there can be non-serializable executions – Undeclared integrity constraints can be violated 9/18/2018 cs 262 a-F 18 Lecture-08 59
- Advanced topics in computer science
- Cis 262
- Cis 262
- Cis262
- Indr 262
- Cis 262
- Advanced topics in software engineering
- Angular guard naming convention
- Angular advanced topics
- Advanced c topics
- Advanced topics in web development
- Android advanced topics
- Embedded systems topics for presentation
- Computer organization course
- Computer and society topics
- Nnemp
- Advanced fueling systems
- Advanced multimedia systems
- Advanced fluid systems inc
- Tom kilcer advanced ag systems
- Ist 421
- Advanced embedded systems
- Projection slice theorem
- Advanced cooling systems inc
- Computer architecture
- Advanced computer graphics
- Advanced computer forensics
- Advanced computer forensics
- Decision support systems and intelligent systems
- Dicapine
- Embedded systems vs cyber physical systems
- Engineering elegant systems: theory of systems engineering
- Computer control of manufacturing systems
- Computer systems engineering carleton
- Dedicated computer examples
- Computer networks vs distributed systems
- What is motion specification in computer graphics
- Unit 2 computer systems
- Installing and configure computer systems
- Unit 2 computer systems
- The protection of information in computer systems
- Computer accounting software
- Computer assisted guidance systems
- Computer systems
- Cs 1104 computer systems
- Cit595
- Computer systems components
- Computer system performance evaluation
- Cit 593
- Integration using tables
- 15-213 introduction to computer systems
- Unit 2 technology systems exam
- Unit 2 computer architecture answers
- Unit 2 computer systems
- 15-213 introduction to computer systems
- Eecs496
- Eecs 489
- Umich eecs 489
- Eecs 489