Transactions or Concurrency Control

Introduction
• A program which operates on a DB performs 2 kinds of operations:
  – Access to the Database (Read/Write)
  – Memory operations
[Figure: the tables Sailors and Reserves on DISK, next to Main Memory]

Read + Memory operations
[Figure: a Read operation, followed by Memory operations]

Introduction
• When dealing with concurrency control, we are only interested in operations on the DB: Read or Write
• So, we deal with “Abstractions” of programs
• An “Abstraction” of a program is the series of operations the program performs on a DB
• We also call this a Transaction (or sometimes, a Program)

Example
Transaction 1: Read(C) Read(A) Write(A) Read(B) Write(C)
or, in short: R1(A) W1(A) R1(B) W1(C)
Operations = Read(A), …
Items = A, B, C

Definitions
• Schedule: The order of execution of operations of 2 or more transactions.
[Figure: Schedule S1, two transaction columns along a Time axis; operations R(A) W(A) R(B) W(C) R(B) W(B) R(C)]

• When a single Transaction is run, there is no Concurrency Control problem
• When there are more, problems might occur
• Example: 2 programs, each adding $100 to an account A

Example: 2 programs, each adding $100 to an account A
• If they are run one after the other:
[Figure: columns Transaction 1 and Transaction 2 along a Time axis; operations R(A) W(A)]
No problem!

Example: 2 programs, each adding $100 to an account A
• If they are run in parallel:
[Figure: columns Transaction 1 and Transaction 2 along a Time axis; operations R(A) W(A) in one column, R(A) W(A) in the other]
Problem! Why?
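The "Why?" can be made concrete with a small simulation. The sketch below is only an illustration: the function `run`, the tuple encoding of a schedule, and the particular interleaving (both transactions read A before either writes) are assumptions, not taken from the slide.

```python
def run(schedule, initial=0):
    """Execute (transaction, operation) steps on a single account A.

    Each transaction reads A into its own private memory, adds 100 there,
    and writes the result back with its Write(A).
    """
    db = {"A": initial}
    local = {}  # private memory of each transaction
    for txn, op in schedule:
        if op == "R":
            local[txn] = db["A"]          # Read(A)
        elif op == "W":
            db["A"] = local[txn] + 100    # memory operation, then Write(A)
    return db["A"]


# One after the other: T1 finishes before T2 starts.
serial = [("T1", "R"), ("T1", "W"), ("T2", "R"), ("T2", "W")]
# In parallel (assumed interleaving): both read the old value of A.
interleaved = [("T1", "R"), ("T2", "R"), ("T1", "W"), ("T2", "W")]

print(run(serial))       # 200
print(run(interleaved))  # 100: the update of T1 is overwritten by T2
```

In the interleaved run both transactions read the same old value, so the second Write(A) overwrites the first and one deposit is lost.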

Definitions
• Serial Schedule: A schedule in which the transactions are performed one after the other in a serial manner.
[Figure: a serial schedule; operations Read(A) Write(A) Read(B) Write(B) Read(C) Write(C) Read(B) Write(B)]

Schedules
• A schedule is “correct” if it gives the same result as a serial schedule for any calculation.
• Examples:
[Figure: example schedules; operations Read(A) Write(A) Read(B) Write(B) Read(A) Write(A) Read(B) Write(B) Read(B) Write(B)]

Schedules
• Example for a “correct” schedule:
[Figure: schedule with operations Read(A) Write(A) Read(B) Write(B)]
Will always give the same result as
[Figure: schedule with operations Read(B) Write(B) Read(A) Write(A) Read(B) Write(B)]
And this will never cause an interleaving problem

• We would thus like to know when 2 schedules are equivalent
• Equivalent: Will give the same result for any input
• How do you check for equivalence?
• Naïve approach: Check the output for all inputs
• This is clearly impossible
• So, we need a simple set of rules to tell us if 2 schedules are equivalent

• Schedules are View Equivalent if:
  1. They consist of the same transactions.
  2. If Tk reads an initial value for A in S1, then Tk will also read an initial value for A in S2 (“initial” = A has not been written to yet).
  3. If Tk reads a value of A written by Ti in S1, then Tk will also read a value of A written by Ti in S2.
  4. If Ti writes a final value for A in S1, then Ti writes a final value for A in S2.
• In what ways do the following schedules violate view-equivalence?
[Figure: tables for Schedule S1 and Schedule S2, each with columns T1, T2, T3 and the operations R1(A) W1(A) R1(C) W1(C), R2(C) W2(C) R2(B) W2(B), R3(C) W3(C)]

Are these schedules View-Equivalent?
[Figure: tables for Schedule S1 and Schedule S2, columns T1, T2, T3. Flattened contents: R1(A) W1(A) R1(C) W1(C) R1(A) R2(C) W2(C) R2(B) W2(B) R3(C) W1(A) R1(C) W1(C) R2(C) W2(C) R2(B) W3(C) R3(C)]

Are these schedules View-Equivalent?
[Figure: tables for Schedule S1 and Schedule S2, columns T1, T2, T3. Flattened contents: R1(A) W1(A) R1(C) W1(C) R2(C) W2(C) R2(B) W2(B) R1(A) W3(C) R3(C) W1(A) R1(C) W1(C) R2(C) W2(C) R2(B) W3(C) R3(C)]

Are these schedules View-Equivalent?
[Figure: tables for Schedule S1 and Schedule S2, columns T1, T2, T3. Flattened contents: R1(A) W1(A) R1(C) W1(C) R1(A) R2(C) W2(C) R2(B) W2(B) W1(A) R3(C) W3(C) R1(C) W1(C) R2(C) W2(C) R2(B) W2(B) R3(C) W3(C)]

Are these schedules View-Equivalent?
[Figure: tables for Schedule S1 and Schedule S2, columns T1, T2, T3. Flattened contents: R1(A) W1(A) R1(C) W1(C) R1(A) R2(C) W2(C) R2(B) W2(B) R3(C) W1(A) R1(C) W1(C) R2(C) W2(C) R2(B) W2(B) R3(C) W3(C)]

View-Equivalence
• If 2 schedules are view-equivalent:
  – The same transactions will read the same values in both schedules
  – Therefore, they will also write the same values
  – This is true for any calculation
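The four view-equivalence conditions can be checked mechanically. The following is a minimal sketch, assuming a schedule is encoded as a list of `(transaction, "R" or "W", item)` tuples; the names `view_profile` and `view_equivalent` and the example schedules are hypothetical. As a simplification, a read is matched to the transaction that wrote the value, not to a specific write of that transaction.

```python
from collections import Counter


def view_profile(schedule):
    """Return (reads-from facts, final writer per item) of a schedule."""
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = set()
    nth_read = {}      # (txn, item) -> number of reads seen so far
    for txn, op, item in schedule:
        if op == "R":
            k = nth_read.get((txn, item), 0)
            nth_read[(txn, item)] = k + 1
            # A writer of None means the read sees the initial value.
            reads_from.add((txn, item, k, last_writer.get(item)))
        else:
            last_writer[item] = txn
    return reads_from, last_writer


def view_equivalent(s1, s2):
    """Conditions 1-4: same operations, same reads-from, same final writes."""
    if Counter(s1) != Counter(s2):
        return False
    return view_profile(s1) == view_profile(s2)


# T1 and T2 touch different items, so interleaving them changes nothing.
s1 = [("T1", "R", "A"), ("T2", "R", "B"), ("T1", "W", "A"), ("T2", "W", "B")]
s2 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "B"), ("T2", "W", "B")]
# Here T2 reads A from T1 in s3 but reads the initial value in s4.
s3 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
s4 = [("T2", "R", "A"), ("T2", "W", "A"), ("T1", "R", "A"), ("T1", "W", "A")]

print(view_equivalent(s1, s2))  # True
print(view_equivalent(s3, s4))  # False
```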

Definitions
• A schedule is View-Serializable if it is View-Equivalent to some Serial schedule.
[Figure: schedules S1 and S2. S1: Read(A) Read(C) Write(A) Read(B) Write(C) Write(B) Read(B) Write(B). S2: Read(A) Write(A) Read(B) Write(B) Read(C) Write(C) Read(B) Write(B)]
Schedule S1 is view-equivalent to a serial schedule (S2), so it is View-Serializable

• What is the Serial Schedule that S1 is equivalent to?
[Figure: schedules S1 and S2. Flattened contents: R(A) W(A) R(C) W(C) R(C) W(B) R(A) W(A) R(B) R(C) W(B) W(C) R(C) W(B) R(A) R(B) W(B)]

• What is the Serial Schedule that S1 is equivalent to?
[Figure: schedule S1; operations R(A) W(A) R(C) W(A) W(C) R(A)]
There is no Serial Schedule that S1 is view-equivalent to. In other words, S1 is not View-Serializable

• We already said that for any equivalent S1, S2: If Tk reads a value of A written by Ti in S1, then Tk will also read a value of A written by Ti in S2.
• In simpler words: If in S1 Read(A) in T1 is “lower” than Write(A) in T2, then this has to hold in S2 too.
• And in a picture:
[Figure: schedule S1, where lower = later; operations W(B) R(A) W(A) R(B)]
• What about Write(A) which is “lower” than Read(A)? And Write(A) which is “lower” than Write(A)? Do these also have to hold in an equivalent schedule?

• Blind Write: A transaction performs a Blind Write of A if it writes A without reading it before.
[Figure: transaction Read(A) Write(A) Read(C) Write(B), where Write(B) is a Blind Write]
• Assuming there are no Blind Writes, and S2 is an equivalent serial schedule:
  1. If Tk writes a value of A which was previously read by Ti in S1, then this will happen in S2 too.
  2. If Tk writes a value of A which was previously written by Ti in S1, then this will happen in S2 too.
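The definition of a Blind Write translates directly into a scan over the operations. This is a small sketch, assuming operations are encoded as `(transaction, "R" or "W", item)` tuples; the function name `blind_writes` is hypothetical.

```python
def blind_writes(schedule):
    """Return (transaction, item) for every write of an item that the
    same transaction has not read earlier in the schedule."""
    already_read = set()
    blind = []
    for txn, op, item in schedule:
        if op == "R":
            already_read.add((txn, item))
        elif (txn, item) not in already_read:
            blind.append((txn, item))
    return blind


# The transaction from the slide: Read(A) Write(A) Read(C) Write(B)
t1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T1", "R", "C"), ("T1", "W", "B")]
print(blind_writes(t1))  # [('T1', 'B')]
```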

• We want to show that if Write(B) in Ti is “lower” than Read(B) in Tk then this has to happen in any equivalent serial schedule.
• Suppose this is the case:
[Figure: S1 and S2. Flattened contents: R(B), S1: R(B) W(B), S2: W(B) R(B)]
• So, suppose this is the case:
[Figure: S1 and S2. Flattened contents: R(B) W(B) R(B), Blind write!, S1: W(B) R(B), S2]

Why is the No Blind Writes demand Necessary?
[Figure: S1 and S2 with a Blind write (R(B), W(B), R(B)) and S1 and S2 with No Blind write (R(B) W(B) R(B))]
Bottom line: if there are no blind writes, and Tk writes a value of A which was previously read by Ti in S1, then this will happen in any equivalent serial schedule

• This can also be shown for two Write operations in the same way. This leads us to the following definition:
• There is a Conflict between 2 operations in different transactions, if at least one of them is a Write, and they are performed on the same item A.
• According to what we showed, if there are no blind writes, the direction of the conflict (arrow) has to be kept in any equivalent serial schedule!
• So is there a view-equivalent serial schedule to S1?
[Figure: schedule S1; operations R(A) W(A) R(B) W(B) R(C)]
Find the conflicts…

• We can now define equivalence between schedules according to their conflicts:
• Schedules S1, S2 are Conflict Equivalent if they consist of the same transactions and the conflict arrows have the same directions.
[Figure: Conflict Equivalent schedules S1 and S2. Flattened contents: R(A) W(A) R(A) R(B) W(B) R(C) W(B) R(C)]
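Conflict arrows can be enumerated by comparing every pair of operations. The sketch below is one possible encoding, assuming `(transaction, "R" or "W", item)` tuples and assuming each transaction performs a given operation on a given item at most once; `conflicts`, `conflict_equivalent`, and the example schedules are hypothetical names and data.

```python
def conflicts(schedule):
    """Return the conflict arrows (earlier operation, later operation)."""
    arrows = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            # Different transactions, same item, at least one Write.
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                arrows.add(((t1, op1, x1), (t2, op2, x2)))
    return arrows


def conflict_equivalent(s1, s2):
    """Same operations, and every conflict arrow points the same way."""
    return sorted(s1) == sorted(s2) and conflicts(s1) == conflicts(s2)


s1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
      ("T1", "R", "B"), ("T2", "W", "B")]
# s2 swaps two adjacent operations that do not conflict.
s2 = [("T1", "R", "A"), ("T1", "W", "A"), ("T1", "R", "B"),
      ("T2", "R", "A"), ("T2", "W", "B")]
# s3 moves R2(A) before W1(A), which reverses a conflict arrow.
s3 = [("T2", "R", "A"), ("T1", "R", "A"), ("T1", "W", "A"),
      ("T1", "R", "B"), ("T2", "W", "B")]

print(conflict_equivalent(s1, s2))  # True
print(conflict_equivalent(s1, s3))  # False
```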

• Lemma: Conflict Equivalence => View Equivalence (this is true even if there are Blind Writes!)
Schedules are Conflict Equivalent if:
  1. They consist of the same transactions.
  2. The conflict arrows have the same directions.
Schedules are View Equivalent if:
  1. They consist of the same transactions.
  2. If Tk reads an initial value for A in S1, then Tk will also read an initial value for A in S2 (initial = A has not been written to).
  3. If Tk reads a value of A written by Ti in S1, then Tk will also read a value of A written by Ti in S2.
  4. If Ti writes a final value for A in S1, then Ti writes a final value for A in S2.
Proof: We assume S1 and S2 are Conflict Equivalent. We need to prove 1-4 from above.

• Schedule S1 is Conflict Serializable if it is Conflict Equivalent to some serial schedule S2.
• Conflict Serializable => View Serializable (directly from the Lemma).
• The other direction is not necessarily true if there are Blind Writes:
[Figure: S1 over T1, T2, T3; operations R(A), W(A), W(A), W(A)]
There is no serial schedule which is conflict equivalent to S1
[Figure: S2 with T1: R(A) W(A), T2: W(A), T3: W(A)]
But S2 is serial and is view-equivalent to S1
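For small schedules, view-serializability can be tested by brute force: try every serial order and compare reads-from and final writes. The sketch below assumes the flattened figure means S1 = R1(A) W2(A) W1(A) W3(A); that reading, the tuple encoding, and the function names are assumptions.

```python
from itertools import permutations


def view_profile(schedule):
    """Return (reads-from facts, final writer per item) of a schedule."""
    last_writer = {}
    reads_from = set()
    nth_read = {}
    for txn, op, item in schedule:
        if op == "R":
            k = nth_read.get((txn, item), 0)
            nth_read[(txn, item)] = k + 1
            reads_from.add((txn, item, k, last_writer.get(item)))
        else:
            last_writer[item] = txn
    return reads_from, last_writer


def view_serial_orders(schedule):
    """Return every serial order of the transactions that is
    view-equivalent to the given schedule."""
    txns = sorted({t for t, _, _ in schedule})
    target = view_profile(schedule)
    orders = []
    for order in permutations(txns):
        # Build the serial schedule: all steps of one transaction, then the next.
        serial = [step for t in order for step in schedule if step[0] == t]
        if view_profile(serial) == target:
            orders.append(order)
    return orders


# Assumed reading of S1: T2 and T3 perform Blind Writes of A.
s1 = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A"), ("T3", "W", "A")]
print(view_serial_orders(s1))  # [('T1', 'T2', 'T3')]
```

Under this reading, S1 has conflicts in both directions between T1 and T2 (R1(A) before W2(A), and W2(A) before W1(A)), so no serial schedule is conflict equivalent to it, yet the serial order T1, T2, T3 is view-equivalent because T3 overwrites A at the end in both.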

The precedence graph
[Figure: schedule S1 over T1 and T2 with operations R(A) W(A) R(B) W(B) R(C), and its precedence graph with nodes T1 and T2]
• Node for each transaction
• Edge from T1 to T2 if there is a conflict between T1 and T2 in which T1 occurs first
• S1 is conflict-serializable iff its precedence graph doesn’t contain a circular path
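The precedence-graph test can be sketched in a few lines: collect an edge for every conflict, then look for a cycle with a depth-first search. The tuple encoding `(transaction, "R" or "W", item)`, the function names, and the two example schedules are assumptions made for illustration.

```python
def precedence_graph(schedule):
    """Return the set of edges (Ti, Tj): Ti has a conflicting operation
    that occurs before an operation of Tj."""
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                edges.add((t1, t2))
    return edges


def has_cycle(edges):
    """Depth-first search for a circular path in a directed graph."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}  # node -> "visiting" while on the DFS stack, then "done"

    def visit(u):
        state[u] = "visiting"
        for v in graph[u]:
            if state.get(v) == "visiting":
                return True  # back edge: found a cycle
            if v not in state and visit(v):
                return True
        state[u] = "done"
        return False

    return any(visit(u) for u in graph if u not in state)


def conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))


ok = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
bad = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
print(conflict_serializable(ok))   # True
print(conflict_serializable(bad))  # False
```

If the graph has no cycle, any topological order of its nodes gives a serial schedule that is conflict equivalent to the original.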

Which is conflict-Serializable?
[Figure: candidate schedules. Flattened contents: W(B) R(A) W(A) R(B) R(A) W(B) R(A) R(B) R(C) W(C) R(C) R(A) W(B) R(A) R(B) W(B) R(C)]

Locks
• Used in order to allow only serializable schedules.
• The principle: before performing a write/read on item A, a transaction asks for a lock on A. Only after getting the lock from the lock-manager can the transaction perform the read/write.
• 2 kinds of locks:
  1. Shared lock: many transactions can hold a shared lock on the same item at the same time.
  2. Exclusive lock: only one transaction can hold an exclusive lock on an item at any given time.
  – In order to Read, a Shared Lock is needed.
  – In order to Write, an Exclusive Lock is needed.
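The compatibility rules for the two kinds of locks can be sketched as a toy lock-manager. Everything here is an assumption made for illustration: the class name `LockManager`, the modes `"S"` and `"X"`, refusing a request instead of queueing it, and letting the sole holder of a lock re-request or upgrade it.

```python
class LockManager:
    """Toy lock table: item -> (mode, set of transactions holding it)."""

    def __init__(self):
        self.holders = {}

    def acquire(self, txn, item, mode):
        """Request a Shared ("S") or Exclusive ("X") lock; True if granted."""
        held = self.holders.get(item)
        if held is None:
            self.holders[item] = (mode, {txn})
            return True
        held_mode, txns = held
        if mode == "S" and held_mode == "S":
            txns.add(txn)  # many transactions may share a lock
            return True
        if txns == {txn}:
            # Assumption: the only holder may keep or upgrade its own lock.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self.holders[item] = (new_mode, txns)
            return True
        return False  # incompatible with a lock held by someone else

    def release(self, txn, item):
        held = self.holders.get(item)
        if held:
            held[1].discard(txn)
            if not held[1]:
                del self.holders[item]


lm = LockManager()
print(lm.acquire("T1", "A", "S"))  # True: first shared lock
print(lm.acquire("T2", "A", "S"))  # True: shared locks are compatible
print(lm.acquire("T3", "A", "X"))  # False: A is locked by others
```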

2-Phase Locking (2PL)
• A protocol (set of rules) which uses locks to ensure only serializable schedules.
• The only additional rule: after a transaction has freed a lock it cannot get any new lock. This means every transaction will perform 2 phases: getting locks, and then releasing locks.
2PL => conflict serializability
[Figure: T1, T2, T3 with operations R/W(A), R/W(A), R/W(B), R/W(C) connected by arrows; at least one end of each arrow is a ‘Write’]
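The 2PL rule itself is easy to check on a trace of lock actions. This is a minimal sketch, assuming a trace of `(transaction, "lock" or "unlock", item)` tuples; the function name `follows_2pl` and the traces are hypothetical.

```python
def follows_2pl(actions):
    """True if no transaction gets a new lock after it has freed one."""
    released = set()  # transactions already in their releasing phase
    for txn, action, _item in actions:
        if action == "unlock":
            released.add(txn)
        elif action == "lock" and txn in released:
            return False
    return True


two_phase = [("T1", "lock", "A"), ("T1", "lock", "B"),
             ("T1", "unlock", "A"), ("T1", "unlock", "B")]
not_two_phase = [("T1", "lock", "A"), ("T1", "unlock", "A"),
                 ("T1", "lock", "B"), ("T1", "unlock", "B")]

print(follows_2pl(two_phase))      # True
print(follows_2pl(not_two_phase))  # False
```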

Recovering from crashes
• Up until now we ignored the possibility of a crash of a transaction.
• To handle such a case we remember Commit and Rollback.
• Consider this schedule:
[Figure: schedule over T1 and T2. Flattened contents: W(B) R(A) W(A) Crash!! R(A) R(B) R(C) W(B) W(C) R(C). T1 finished so it commits; T2 rolls back]
Notice that this schedule is Conflict Serializable!
Problem: T1 read a value which T2 wrote, and committed. The schedule is called “Not Recoverable”
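Recoverability can be checked by tracking who read from whom and in which order transactions commit. The sketch below assumes a schedule encoded as `(transaction, op, item)` tuples with op in `"R"`, `"W"`, `"C"` (commit) or `"A"` (rollback), and item `None` for the last two; this encoding and the name `is_recoverable` are assumptions. Restoring old values after a rollback is not modelled.

```python
def is_recoverable(schedule):
    """True if every transaction commits only after all transactions
    it has read from have committed."""
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # reader -> set of transactions it read from
    committed = set()
    for txn, op, item in schedule:
        if op == "W":
            last_writer[item] = txn
        elif op == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "C":
            # Committing before a transaction we read from is unsafe.
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True


# T1 reads what T2 wrote and commits; then T2 rolls back.
bad = [("T2", "W", "A"), ("T1", "R", "A"), ("T1", "C", None), ("T2", "A", None)]
# T1 waits for T2 to commit before committing itself.
good = [("T2", "W", "A"), ("T1", "R", "A"), ("T2", "C", None), ("T1", "C", None)]

print(is_recoverable(bad))   # False
print(is_recoverable(good))  # True
```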

Recovering from crashes
• Solution: Commit only after all transactions which you have read from have committed (assuming you are a transaction).
• Even more strict solution: Read an item only after all transactions which wrote this item have committed
• This leads to a new protocol: Strict 2PL: Same rules as 2PL with the addition that a transaction releases its locks only after it has committed.
Strict 2PL ensures recoverability. Why?
Good luck!