EECS 262 a Advanced Topics in Computer Systems

  • Slides: 58
Download presentation
EECS 262 a Advanced Topics in Computer Systems Lecture 8 Transactions and Isolation Levels

EECS 262 a Advanced Topics in Computer Systems Lecture 8 Transactions and Isolation Levels September 18 th, 2018 John Kubiatowicz Based on slides by Ali Ghodsi and Ion Stoica http: //www. eecs. berkeley. edu/~kubitron/cs 262

Today’s Papers • Granularity of Locks and Degrees of Consistency in a Shared Database

Today’s Papers • Granularity of Locks and Degrees of Consistency in a Shared Database (2 -up version) J. N. Gray, R. A. Lorie, G. R. Putzolu, I. L. Traiger. Appears In IFIP Working Conference on Modeling of Data Base Management Systems. 1975 • Generalized Isolation Level Definitions, A. Adya, B. Liskov, and P. O'Neil, 2000 • Thoughts? 9/18/2018 cs 262 a-F 18 Lecture-08 2

The ACID properties of Transactions • Atomicity: all actions in the transaction happen, or

The ACID properties of Transactions • Atomicity: all actions in the transaction happen, or none happen • Consistency: if each transaction is consistent, and the database starts consistent, it ends up consistent, e. g. , – Balance cannot be negative – Cannot reschedule meeting on February 30 • Isolation: execution of one transaction is isolated from others • Durability: if a transaction commits, its effects persist 9/18/2018 cs 262 a-F 18 Lecture-08 3

Example: Transaction 101 BEGIN; --BEGIN TRANSACTION UPDATE accounts SET balance = balance - 100.

Example: Transaction 101 BEGIN; --BEGIN TRANSACTION UPDATE accounts SET balance = balance - 100. 00 WHERE name = 'Alice'; UPDATE branches SET balance = balance - 100. 00 WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Alice'); UPDATE accounts SET balance = balance + 100. 00 WHERE name = 'Bob'; UPDATE branches SET balance = balance + 100. 00 WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Bob'); COMMIT; --COMMIT WORK Transfer $100 from Alice’s account to Bob’s account 9/18/2018 cs 262 a-F 18 Lecture-08 4

Why is it Hard? • Failures: might leave state inconsistent or cause updates to

Why is it Hard? • Failures: might leave state inconsistent or cause updates to be lost – Remember last lecture? • Concurrency: might leave state inconsistent or cause updates to be lost – This lecture and the next one! 9/18/2018 cs 262 a-F 18 Lecture-08 5

Concurrency • When operations of concurrent threads are interleaved, the effect on shared state

Concurrency • When operations of concurrent threads are interleaved, the effect on shared state can be unexpected • Well known issue in operating systems, thread programming – Critical section in OSes – Java use of synchronized keyword 9/18/2018 cs 262 a-F 18 Lecture-08 6

Transaction Scheduling • Why not run only one transaction at a time? • Answer:

Transaction Scheduling • Why not run only one transaction at a time? • Answer: low system utilization – Two transactions cannot run simultaneously even if they access different data • Goal of transaction scheduling: – Maximize system utilization, i. e. , concurrency » Interleave operations from different transactions – Preserve transaction semantics » Logically all operations in a transaction are executed atomically » Intermediate state of a transaction is not visible to other transactions 9/18/2018 cs 262 a-F 18 Lecture-08 7

Anomalies with Interleaved Execution • May violate transaction semantics, e. g. , some data

Anomalies with Interleaved Execution • May violate transaction semantics, e. g. , some data read by the transaction changes before committing • Inconsistent database state, e. g. , some updates are lost • Anomalies always involves a “write”; Why? 9/18/2018 cs 262 a-F 18 Lecture-08 8

P 0 – Overwriting uncommitted data • Write-write conflict – T 2 writes value

P 0 – Overwriting uncommitted data • Write-write conflict – T 2 writes value modified by T 1 before T 1 commits, e. g, T 2 overwrites W(A) before T 1 commits T 1: W(A), W(B) T 2: W(A), W(B) • Violates transaction serializability • If transactions were serial, you’d get either: – T 1’s updates of A and B – T 2’s updates of A and B 9/18/2018 cs 262 a-F 18 Lecture-08 9

P 1 – Reading uncommitted data (dirty read) • Write-read conflict (reading uncommitted data

P 1 – Reading uncommitted data (dirty read) • Write-read conflict (reading uncommitted data or dirty read) – T 2 reads value modified by T 1 before T 1 commits, e. g. , T 2 reads A before T 1 modifies it T 1: R(A), W(A), T 2: R(A), 9/18/2018 … cs 262 a-F 18 Lecture-08 10

P 3 – Non-repeatable reads • Read-Write conflict – T 2 reads value, after

P 3 – Non-repeatable reads • Read-Write conflict – T 2 reads value, after which T 1 modifies it, e. g. , T 2 reads A, after which T 1 modifies it T 1: R(A), W(A) T 2: R(A), W(A) • Example: Mary and John want to buy a TV set on Amazon but there is only one left in stock – (T 1) John logs first, but waits… – (T 2) Mary logs second and buys the TV set right away – (T 1) John decides to buy, but it is too late… 9/18/2018 cs 262 a-F 18 Lecture-08 11

Goals of Transaction Scheduling • Maximize system utilization, i. e. , concurrency – Interleave

Goals of Transaction Scheduling • Maximize system utilization, i. e. , concurrency – Interleave operations from different transactions • Preserve transaction semantics – Semantically equivalent to a serial schedule, i. e. , one transaction runs at a time T 1: R, W, R, W Serial schedule (T 1, then T 2): R, W, R, R, W 9/18/2018 T 2: R, W, R, R, W Serial schedule (T 2, then T 1): R, W, R, W cs 262 a-F 18 Lecture-08 12

Two Key Questions 1) Is a given schedule equivalent to a serial execution of

Two Key Questions 1) Is a given schedule equivalent to a serial execution of transactions? Schedule: R, R, W, W, R, R, R, W, W Serial schedule (T 1, then T 2): R, W, R, R, W : Serial schedule (T 2, then T 1): R, W, R, W 2) How do you come up with a schedule equivalent to a serial schedule? 9/18/2018 cs 262 a-F 18 Lecture-08 13

Transaction Scheduling • Serial schedule: – A schedule that does not interleave the operations

Transaction Scheduling • Serial schedule: – A schedule that does not interleave the operations of different transactions – Transactions run serially (one at a time) • Equivalent schedules: – For any storage/database state, the effect (on storage/database) and output of executing the first schedule is identical to the effect of executing the second schedule • Serializable schedule: – A schedule that is equivalent to some serial execution of the transactions – Intuitively: with a serializable schedule you only see things that could happen in situations where you were running transactions one-at-a-time 9/18/2018 cs 262 a-F 18 Lecture-08 14

Conflict Serializable Schedules • Two operations conflict if they – Belong to different transactions

Conflict Serializable Schedules • Two operations conflict if they – Belong to different transactions – Are on the same data – At least one of them is a write • Two schedules are conflict equivalent iff: – Involve same operations of same transactions – Every pair of conflicting operations is ordered the same way • Schedule S is conflict serializable if S is conflict equivalent to some serial schedule 9/18/2018 cs 262 a-F 18 Lecture-08 15

Conflict Equivalence – Intuition • If you can transform an interleaved schedule by swapping

Conflict Equivalence – Intuition • If you can transform an interleaved schedule by swapping consecutive non-conflicting operations of different transactions into a serial schedule, then the original schedule is conflict serializable, e. g. , T 1: R(A), W(A), R(B), W(B) T 2: R(A), W(A), R(B), W(B) 9/18/2018 cs 262 a-F 18 Lecture-08 16

Conflict Equivalence – Intuition • If you can transform an interleaved schedule by swapping

Conflict Equivalence – Intuition • If you can transform an interleaved schedule by swapping consecutive non-conflicting operations of different transactions into a serial schedule, then the original schedule is conflict serializable, e. g. , T 1: R(A), W(A), R(B), W(B) T 2: R(A), W(A), R(B), W(B) 9/18/2018 cs 262 a-F 18 Lecture-08 17

Conflict Equivalence – Intuition • If you can transform an interleaved schedule by swapping

Conflict Equivalence – Intuition • If you can transform an interleaved schedule by swapping consecutive non-conflicting operations of different transactions into a serial schedule, then the original schedule is conflict serializable, e. g. , T 1: R(A), W(A) T 2: R(A), W(A), • Is this schedule serializable? 9/18/2018 cs 262 a-F 18 Lecture-08 18

Dependency Graph • Dependency graph: – Transactions represented as nodes – Edge from Ti

Dependency Graph • Dependency graph: – Transactions represented as nodes – Edge from Ti to Tj: » an operation of Ti conflicts with an operation of Tj » Ti appears earlier than Tj in the schedule • Theorem: Schedule is conflict serializable if and only if its dependency graph is acyclic 9/18/2018 cs 262 a-F 18 Lecture-08 19

Example • Conflict serializable schedule: T 1: R(A), W(A), R(B), W(B) T 2: R(A),

Example • Conflict serializable schedule: T 1: R(A), W(A), R(B), W(B) T 2: R(A), W(A), R(B), W(B) T 1 A B T 2 Dependency graph • No cycle! 9/18/2018 cs 262 a-F 18 Lecture-08 20

Example • Conflict that is not serializable: T 1: R(A), W(A), R(B), W(B) T

Example • Conflict that is not serializable: T 1: R(A), W(A), R(B), W(B) T 2: R(A), W(A), R(B), W(B) A T 1 B T 2 Dependency graph • Cycle: The output of T 1 depends on T 2, and vice-versa 9/18/2018 cs 262 a-F 18 Lecture-08 21

Notes on Conflict Serializability • Conflict Serializability doesn’t allow all schedules that you would

Notes on Conflict Serializability • Conflict Serializability doesn’t allow all schedules that you would consider correct – This is because it is strictly syntactic - it doesn’t consider the meanings of the operations or the data • Many times, Conflict Serializability is what gets used, because it can be done efficiently – See isolation degrees/levels next • Two-phase locking (2 PL) is how we implement it 9/18/2018 cs 262 a-F 18 Lecture-08 22

Serializability ≠ Conflict Serializability • Following schedule is not conflict serializable T 1: R(A),

Serializability ≠ Conflict Serializability • Following schedule is not conflict serializable T 1: R(A), W(A), T 2: W(A), T 3: WA Dependency graph A A T 1 T 2 A A T 3 • However, the schedule is serializable since its output is equivalent with the following serial schedule T 1: R(A), W(A), T 2: W(A), T 3: WA • Note: deciding whether a schedule is serializable (not conflict-serializable) is NP-complete 9/18/2018 cs 262 a-F 18 Lecture-08 23

Locks • “Locks” to control access to data • Two types of locks: –

Locks • “Locks” to control access to data • Two types of locks: – shared (S) lock: multiple concurrent transactions allowed to operate on data – exclusive (X) lock: only one transaction can operate on data at a time HeldRequest S S Yes X Block 9/18/2018 X Block cs 262 a-F 18 Lecture-08 Lock Compatibility Matrix 24

Two-Phase Locking (2 PL) • Each transaction must obtain: S (shared) or X (exclusive)

Two-Phase Locking (2 PL) • Each transaction must obtain: S (shared) or X (exclusive) lock on data before reading – X (exclusive) lock on data before writing – • A transaction can not request additional locks once it releases any locks Thus, each transaction has a “growing phase” followed by a “shrinking phase” # Locks Held – Lock Point! 4 3 2 1 0 Growing Phase 1 9/18/2018 Shrinking 3 Phase 5 7 cs 262 a-F 18 Lecture-08 9 11 13 15 17 19 Time 25

Two-Phase Locking (2 PL) • 2 PL guarantees conflict serializability – Doesn’t allow dependency

Two-Phase Locking (2 PL) • 2 PL guarantees conflict serializability – Doesn’t allow dependency cycles. Why? • Answer: a dependency cycle leads to deadlock – Assume there is a cycle between Ti and Tj – Edge from Ti to Tj: Ti acquires lock first and Tj needs to wait – Edge from Tj to Ti: Tj acquires lock first and Ti needs to wait – Thus, both Ti and Tj wait for each other – Since with 2 PL neither Ti nor Tj release locks before acquiring all locks they need deadlock • Schedule of conflicting transactions is conflict equivalent to a serial schedule ordered by “lock point” 9/18/2018 cs 262 a-F 18 Lecture-08 26

Example • T 1 transfers $50 from account A to account B T 1:

Example • T 1 transfers $50 from account A to account B T 1: Read(A), A: =A-50, Write(A), Read(B), B: =B+50, Write(B) • T 2 outputs the total of accounts A and B T 2: Read(A), Read(B), PRINT(A+B) • Initially, A = $1000 and B = $2000 • What are the possible output values? 9/18/2018 cs 262 a-F 18 Lecture-08 27

Is this a 2 PL Schedule? 1 Lock_X(A) <granted> 2 Read(A) 3 A: =

Is this a 2 PL Schedule? 1 Lock_X(A) <granted> 2 Read(A) 3 A: = A-50 4 Write(A) 5 Unlock(A) Lock_S(A) <granted> 6 Read(A) 7 Unlock(A) 8 Lock_S(B) <granted> 9 Lock_X(B) 10 11 Read(B) <granted> 12 Unlock(B) PRINT(A+B) 13 Read(B) 14 B : = B +50 15 Write(B) 16 Unlock(B) No, and it is not serializable 9/18/2018 cs 262 a-F 18 Lecture-08 28

Is this a 2 PL Schedule? 1 Lock_X(A) <granted> 2 Read(A) 3 A: =

Is this a 2 PL Schedule? 1 Lock_X(A) <granted> 2 Read(A) 3 A: = A-50 4 Write(A) 5 Lock_X(B) <granted> 6 Unlock(A) Lock_S(A) <granted> 7 Read(A) 8 Lock_S(B) 9 Read(B) 10 B : = B +50 11 Write(B) 12 Unlock(B) <granted> 13 Unlock(A) 14 Read(B) 15 Unlock(B) 16 PRINT(A+B) Yes, it is serializable 9/18/2018 cs 262 a-F 18 Lecture-08 29

Strict 2 PL (cont’d) • All locks held by a transaction are released only

Strict 2 PL (cont’d) • All locks held by a transaction are released only when the transaction completes – In effect, “shrinking phase” is delayed until: a) Transaction has committed (commit log record on disk), or b) Decision has been made to abort the transaction (then locks can be released after rollback). 9/18/2018 cs 262 a-F 18 Lecture-08 30

Is this a Strict 2 PL schedule? 1 Lock_X(A) <granted> 2 Read(A) 3 A:

Is this a Strict 2 PL schedule? 1 Lock_X(A) <granted> 2 Read(A) 3 A: = A-50 4 Write(A) 5 Lock_X(B) <granted> 6 Unlock(A) Lock_S(A) <granted> 7 Read(A) 8 Lock_S(B) 9 Read(B) 10 B : = B +50 11 Write(B) 12 Unlock(B) <granted> 13 Unlock(A) 14 Read(B) 15 Unlock(B) 16 PRINT(A+B) No: Cascading Abort Possible 9/18/2018 cs 262 a-F 18 Lecture-08 31

Granularity • What is a data item (on which a lock is obtained)? –

Granularity • What is a data item (on which a lock is obtained)? – In most modern systems: item is one tuple in a table – Sometimes (especially in early 1970 s): item is a page (with several tuples) – Sometimes: item is a whole table 9/18/2018 cs 262 a-F 18 Lecture-08 32

Granularity trade-offs • Larger granularity: fewer locks held, so less overhead; but less concurrency

Granularity trade-offs • Larger granularity: fewer locks held, so less overhead; but less concurrency possible – “false conflicts” when txns deal with different parts of the same item • Smaller “fine” granularity: more locks held, so more overhead; but more concurrency is possible • System usually gets fine grain locks until there are too many of them; then it replaces them with larger granularity locks 9/18/2018 cs 262 a-F 18 Lecture-08 33

Multigranular locking • Care needed to manage conflicts properly among items of varying granularity

Multigranular locking • Care needed to manage conflicts properly among items of varying granularity – Note: conflicts only detectable among locks on a given item name • System gets “intention” mode locks on larger granules before getting actual S/X locks on smaller granules – Conflict rules arranged so that activities that do not commute must get conflicting locks on some item 9/18/2018 cs 262 a-F 18 Lecture-08 34

Lock Mode Conflicts HeldRequest IS IX S SIX X 9/18/2018 IS Yes Yes Block

Lock Mode Conflicts HeldRequest IS IX S SIX X 9/18/2018 IS Yes Yes Block IX Yes Block Block SIX Yes Block cs 262 a-F 18 Lecture-08 X Block Block 35

Problems with serializability • The performance reduction from isolation is high – Transactions are

Problems with serializability • The performance reduction from isolation is high – Transactions are often blocked because they want to read data that another transactions has changed • For many applications, the accuracy of the data they read is not crucial – e. g. overbooking a plane is ok in practice – e. g. your banking decisions would not be very different if you saw yesterday’s balance instead of the most up-todate 9/18/2018 cs 262 a-F 18 Lecture-08 37

Explicit isolation levels • A transaction can be declared to have isolation properties that

Explicit isolation levels • A transaction can be declared to have isolation properties that are less stringent than serializability – However SQL standard says that default should be serializable (Gray’ 75 called this “level 3 isolation”) – In practice, most systems have weaker default level, and most transactions run at weaker levels! • Isolation levels are defined with respect to data access conflicts (phenomena) they preclude 9/18/2018 cs 262 a-F 18 Lecture-08 38

Phenomena • P 0: T 2 writes value modified by T 1 before T

Phenomena • P 0: T 2 writes value modified by T 1 before T 1 commits – Transactions cannot be serialized by their writes • P 1: Dirty Read: T 2 reads value modified by T 1 before T 1 commits – If T 1 aborts it will be as if transaction T 2 read values that have never existed • P 2: Non-Repeatable Read: T 2 reads value, then T 1 modifies it – If T 2 attempts to re-read value it can read another value • P 3: Phantom: (see next) 9/18/2018 cs 262 a-F 18 Lecture-08 39

Phantom 1. A transaction T 1 reads a set of rows that satisfy some

Phantom 1. A transaction T 1 reads a set of rows that satisfy some condition 2. Another transaction T 2 executes a statement that causes new rows to be added or removed from the search condition 3. If T 1 repeats the read it will obtain a different set of rows. 9/18/2018 cs 262 a-F 18 Lecture-08 40

Phantom Example T 1 T 2 Select count(*) where dept = “Acct” // find

Phantom Example T 1 T 2 Select count(*) where dept = “Acct” // find and S-lock (“Sue”, “Acct”, 3500) and (“Tim”, “Acct, 2400) Insert (“Joe”, ”Acct”, 2000) // X-lock the new record Commit // release locks Select sum(salary) where dept = “Acct” // find and S-lock (“Sue”, “Acct”, 3500) and (“Tim”, “Acct, 2400) and (“Joe”, “Acct”, 2000) 9/18/2018 cs 262 a-F 18 Lecture-08 41

Isolation Levels Isolation levels Degree Proscribed Phenomena Read locks on data items and phantoms

Isolation Levels Isolation levels Degree Proscribed Phenomena Read locks on data items and phantoms (same unless noted) Write locks on data items and phantoms (always the same) 0 none Short write locks READ UNCOMMITTED 1 P 0 none Long write locks READ COMITTED 2 P 0, P 1 Short read locks Long write locks P 0, P 1, P 2 Long data-item read locks, short phantom locks Long write locks P 0, P 1, P 2, P 3 Long read locks Long write locks REAPEATABLE READ SERIALIZABLE ANSI 9/18/2018 3 Gray’s isolation degrees cs 262 a-F 18 Lecture-08 42

Generalized Isolation Levels 9/18/2018 cs 262 a-F 18 Lecture-08 43

Generalized Isolation Levels 9/18/2018 cs 262 a-F 18 Lecture-08 43

Direct Serialization Graph (DSG) Conflict Name Description DSG Directly write-depends T 1 writes value,

Direct Serialization Graph (DSG) Conflict Name Description DSG Directly write-depends T 1 writes value, then T 2 overwrites it T 1 ww T 2 Directly read-depends T 1 writes value, then T 2 reads it T 1 wr T 2 Directly anti-depends T 1 reads value, then T 2 writes it T 1 rw T 2 Example: T 1: W(A), W(B), W(C) T 2: R(B), W(C) T 3: W(B) R(C), W(B) wr T 1 T 2 ww 9/18/2018 rw wr T 3 ww cs 262 a-F 18 Lecture-08 44

Disallowing P 0 • Writes by T 1 are not overwritten by T 2

Disallowing P 0 • Writes by T 1 are not overwritten by T 2 while T 1 is uncommitted – Simplifies recovery from aborts, e. g. , » T 1 updates x, T 2 overwrites x , and then T 1 aborts » The system must not restore x to T 1’s pre-state » However, if T 2 aborts later, x must be restored to T 1’s prestate! – Serializes transactions based on their writes alone » all writes of T 2 must be ordered before or after all writes of T 1 G 0 only addresses this one 9/18/2018 cs 262 a-F 18 Lecture-08 45

G 0 • G 0: DSG contains a directed cycle consisting entirely of write-dependency

G 0 • G 0: DSG contains a directed cycle consisting entirely of write-dependency edges – Just ensure serialization on writes alone – More permissive than Degree 1 as allows concurrent transactions to modify same object • Example: T 1: W(A) W(B), … T 2: W(A), W(B), … ww T 1 9/18/2018 ww cs 262 a-F 18 Lecture-08 T 2 46

Disallowing P 1 • Writes of T 1 could not be read by T

Disallowing P 1 • Writes of T 1 could not be read by T 2 while T 1 is still uncommitted – It prevents a transaction T 2 from committing if T 2 has read the updates of a transaction that might later abort – It prevents transactions from reading intermediate modifications of other transactions – It serializes committed transactions based on their read/write-dependencies (but not their antidependencies), i. e. , » If transaction T 2 depends on T 1, T 1 cannot depend on T 2 9/18/2018 cs 262 a-F 18 Lecture-08 47

G 1 • G 1 a – Aborted reads: T 2 has read a

G 1 • G 1 a – Aborted reads: T 2 has read a value written by an aborted transaction T 1 • G 1 b – Intermediate Reads: Committed transaction T 2 has read an intermediate value written by transaction T 1 • G 1 c – Circular Information Flow: DSG contains a directed • cycle consisting entirely of dependency edges – Disallowing G 1 c ensures that if transaction T 2 is affected by transaction T 1, T 2 does not affect T 1 9/18/2018 cs 262 a-F 18 Lecture-08 48

Disallowing P 2 • T 1 cannot modify value read by T 2 –

Disallowing P 2 • T 1 cannot modify value read by T 2 – Precludes a transaction reading inconsistent data and making inconsistent updates 9/18/2018 cs 262 a-F 18 Lecture-08 49

G 2 • Just prevent transactions that perform inconsistent reads or writes from committing

G 2 • Just prevent transactions that perform inconsistent reads or writes from committing • G 2 – Anti-dependency Cycles: DSG contains a directed cycle with one or more anti-dependency edges • G 2 -item – Item Anti-dependency Cycles: DSG contains a directed cycle having one or more item-antidependency edges 9/18/2018 cs 262 a-F 18 Lecture-08 50

Generalized Isolation Levels Isolation levels G 0 G 1 G 2 -Item G 2

Generalized Isolation Levels Isolation levels G 0 G 1 G 2 -Item G 2 READ UNCOMMITTED NA NA READ COMITTED Not possible Possible REAPEATABLE READ Not possible Possible SERIALIZABLE Not possible 9/18/2018 cs 262 a-F 18 Lecture-08 51

Summary • Transactions, key abstractions on databases – Application defined sequence of operations on

Summary • Transactions, key abstractions on databases – Application defined sequence of operations on one or more databases that is atomic • Key challenge: trade performance to correctness – On one hand we want to interleave transactions to increase throughput – On the other hand we want to isolate transactions from each other • Solution: increase interleaving by providing – Multi-granularity locks – Relax the isolation semantics 9/18/2018 cs 262 a-F 18 Lecture-08 52

Were these good papers? • What were the authors’ goals? • What about the

Were these good papers? • What were the authors’ goals? • What about the evaluation / metrics? • Did they convince you that this was a good system /approach? • Were there any red-flags? • What mistakes did they make? • Does the system/approach meet the “Test of Time” challenge? • How would you review this paper today? 9/18/2018 cs 262 a-F 18 Lecture-08 53

Extra slides… 9/18/2018 cs 262 a-F 18 Lecture-08 54

Extra slides… 9/18/2018 cs 262 a-F 18 Lecture-08 54

Snapshot Isolation (SI) • A multiversion concurrency control mechanism was described in SIGMOD ’

Snapshot Isolation (SI) • A multiversion concurrency control mechanism was described in SIGMOD ’ 95 by H. Berenson, P. Bernstein, J. Gray, J. Melton, E. O’Neil, P. O’Neil – Does not guarantee serializable execution! • Supplied by Oracle DB for “Isolation Level Serializable” (also in Postgre. SQL before rel 9. 1) • Available in Microsoft SQL Server 2005 as “Isolation Level Snapshot”, and in Postgre. SQL (since rel 9. 1) as “Isolation Level Repeatable Read” 9/18/2018 cs 262 a-F 18 Lecture-08 55

Snapshot Isolation (SI) • Read of an item may not give current value •

Snapshot Isolation (SI) • Read of an item may not give current value • Instead, use old versions (kept with timestamps) to find value that had been most recently committed at the time the txn started – Exception: if the txn has modified the item, use the value it wrote itself • The transaction sees a “snapshot” of the database, at an earlier time – Intuition: this should be consistent, if the database was consistent before 9/18/2018 cs 262 a-F 18 Lecture-08 56

First committer wins (FCW) • T will not be allowed to commit a modification

First committer wins (FCW) • T will not be allowed to commit a modification to an item if any other transaction has committed a changed value for that item since T’s start (snapshot) • T must hold write locks on modified items at time of commit, to install them. – In practice, commit-duration write locks may be set when writes execute. – These simplify detection of conflicting modifications when T tries to write the item, instead of waiting till T tries to commit. 9/18/2018 cs 262 a-F 18 Lecture-08 57

Benefits of SI • Reading is never blocked, and reads don’t block writes •

Benefits of SI • Reading is never blocked, and reads don’t block writes • Avoids common anomalies – No dirty read – No lost update – No inconsistent read – Set-based selects are repeatable (no phantoms) • Matches common understanding of isolation: concurrent transactions are not aware of one another’s changes 9/18/2018 cs 262 a-F 18 Lecture-08 58

Is every execution serializable? • For any set of txns, if they all run

Is every execution serializable? • For any set of txns, if they all run with Two Phase Locking, then every interleaved execution is serializable • For some sets of txns, if they all run with SI, then every execution is serializable – Eg the txns making up TPC-C • For some sets of txns, if they all run with SI, there can be non-serializable executions – Undeclared integrity constraints can be violated 9/18/2018 cs 262 a-F 18 Lecture-08 59