Transaction Management and Concurre cy Overview R G

  • Slides: 26
Download presentation
Transaction Management and Concurre cy Overview R & G Chapter 16 -17 There are

Transaction Management and Concurre cy Overview R & G Chapter 16 -17 There are three side effects of acid. Enhanced long term memory, decreased short term memory, and I forget the third. - Timothy Leary

Administrivia • Homework 3 due today by end of class period • Homework 4

Administrivia • Homework 3 due today by end of class period • Homework 4 available on class website – Due date: April 10 (after Spring Break) • Midterm 2 is next class! Thu 3/22 – In class, covers lectures 10 -17 – Review will be held Wed 3/21 7 -9 pm 306 Soda Hall

Components of a DBMS transaction query Query Compiler Execution Engine Data Definition We are

Components of a DBMS transaction query Query Compiler Execution Engine Data Definition We are here Transaction Manager Logging/Recovery Schema Manager Concurrency Control Buffer Manager LOCK TABLE Storage Manager BUFFERS BUFFER POOL DBMS: a set of cooperating software modules

Correctness criteria: The ACID properties • • A tomicity: All actions in the Xact

Correctness criteria: The ACID properties • • A tomicity: All actions in the Xact happen, or none happen. C onsistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent. • I solation: Execution of one Xact is isolated from that of other Xacts. • D urability: If a Xact commits, its effects persist. We are here

Review - Definitions • Transaction - a sequence of read and write operations (read(A),

Review - Definitions • Transaction - a sequence of read and write operations (read(A), write(B), …) – DBMS’s abstract view of a user program • Serial schedule: Schedule that does not interleave the actions of different transactions. • Equivalent schedules: For any database state, the effect of executing the first schedule is identical to the effect of executing the second schedule. • Serializable schedule: A schedule that is equivalent to some serial execution of the transactions. (Note: If each transaction preserves consistency, every serializable schedule preserves consistency. )

Anomalies with interleaved execution • Reading Uncommitted Data (“WR”, dirty reads): – T 2

Anomalies with interleaved execution • Reading Uncommitted Data (“WR”, dirty reads): – T 2 reads a value A that T 1 wrote but didn’t commit – e. g: T 1 moves $100 from account B to account A T 2 adds 6% interest to account A $1100 + $66 + 900 = $2066 total after T 1 & T 2 A=1000 A=A+100=1100 T 1: T 2: R(A), W(A), B=1000 ->A=1166 B=1000 -100 A=? =900 B=1000 R(B), W(B), Abort R(A), W(A), C A=1100*1. 06 A=1166 B=1000 1166+1000 = 2166 -> Bank is out $106 because T 2 read T 1’s A value before it was done!

Anomalies with interleaved execution • Unrepeatable Reads (“RW” Conflicts): – T 1 reads a

Anomalies with interleaved execution • Unrepeatable Reads (“RW” Conflicts): – T 1 reads a value A that is then written by T 2 – e. g. T 1 and T 2 place orders for books, and A represents the quantity in stock (1) A=1 T 1: T 2: A=0 R(A), A=-1 R(A), W(A), C A=1 A=0 Hopefully T 1 gets an error if there is an integrity constraint!

Anomalies with interleaved execution • Overwriting Uncommitted Data (“WW”, lost update): – T 2

Anomalies with interleaved execution • Overwriting Uncommitted Data (“WW”, lost update): – T 2 overwrites a write by T 1 – e. g. T 1 assigns seat A to passenger 1 and seat B to passenger 2 T 2 assigns seat A to passenger 3 and seat B to passenger 4 B=2 A=1 T 1: T 2: W(A), A=3 B=2 W(B), C W(A), W(B), C A=3 B=4 -> Passenger 1’s partner is sitting with passenger 3!

How to prevent anomalies? Locks! • Database allows “objects” to be locked – “object”

How to prevent anomalies? Locks! • Database allows “objects” to be locked – “object” might be entire database, file, page, tuple • Two kinds of locks: – “Shared” or “Read” Lock • No one else is allowed to write the object if you have this – “Exclusive” or “Write” Lock • No one else is allowed to read or write the object

Locks are not enough • If lock/unlock objects right away, anomalies are still possible

Locks are not enough • If lock/unlock objects right away, anomalies are still possible – e. g. The unrepeatable read example T 1 obtain Share. Lock(A) T 1: T 2: T 1 release Share. Lock(A) T 1 obtain Share. Lock(A) R(A), W(A), C T 2 obtain Exclusive. Lock. L(A) T 2 release Exclusive. Lock(A)

Locks are not enough • Idea: Two Phase Locking – In a transaction, #locks

Locks are not enough • Idea: Two Phase Locking – In a transaction, #locks • only acquire locks in one phase • only release locks in a second phase Time • once one lock has been released, can never acquire another lock during transaction T 1 obtain Exclusive. Lock(A) T 1: T 2: T 1 release lock(A) R(A), W(A), …. R(A), W(A), C T 2 waits to obtain Exclusive. Lock(A) T 2 obtains Exclusive. Lock(A)

Lock-Based Concurrency Control Two-phase Locking (2 PL) Protocol: – Each Xact must obtain: •

Lock-Based Concurrency Control Two-phase Locking (2 PL) Protocol: – Each Xact must obtain: • • – – – a S (shared) lock on object before reading, and an X (exclusive) lock on object before writing. If an Xact holds an X lock on an object, no other Xact can get a lock (S or X) on that object. System can obtain these locks automatically Two phases: acquiring locks, and releasing them • • No lock is ever acquired after one has been released “Growing phase” followed by “shrinking phase”. • Lock Manager keeps track of request for locks and grants locks on database objects when they become available.

Strict 2 PL • 2 PL allows only serializable schedules but is subjected to

Strict 2 PL • 2 PL allows only serializable schedules but is subjected to cascading aborts. • Example: rollback of T 1 requires rollback of T 2! T 1 obtain Exclusive. Lock(A) T 1: T 2: T 1 release Exclusive. Lock(A) R(A), W(A), Abort R(A), W(A), R(B), W(B) T 2 obtains Exclusive. Lock(A) • T 2 read T 1’s value of A, so it must also be aborted. To avoid Cascading aborts, use Strict 2 PL

Strict 2 PL (cont) • Strict Two-phase Locking (Strict 2 PL) Protocol: – Same

Strict 2 PL (cont) • Strict Two-phase Locking (Strict 2 PL) Protocol: – Same as 2 PL, except: – All locks held by a transaction are released only when the transaction completes #locks vs • Advantage: no other transaction even reads anything you write until you commit. – e. g a transaction will only read committed data. • Disadvantage: transactions end up waiting. – e. g. inserts may lock a whole table • Why? Because of Phantom problem

The Phantom Problem • Even Strict 2 PL (on individual rows) will not assure

The Phantom Problem • Even Strict 2 PL (on individual rows) will not assure serializability: • Consider T 1 – “Find oldest sailor with Rating = 1” • T 1 sees 2 different answers, even though it did no updates itself. Time T 1 1 Obtain Share Lock on all existing Sailors with rating = 1 2 Compute oldest sailor (age = 71) T 2 3 Obtain Exclusive Lock on new Sailor tuple 4 Insert Sailor age 96 rating 1 5 Commit 6 Compute oldest sailor (age = 96)

Transactions in SQL • A new transaction is started with first SQL statement issued

Transactions in SQL • A new transaction is started with first SQL statement issued by a user SELECT S. SID, R. BID FROM SAILORS S, RESERVES R WHERE S. SID = R. SID • All statements from that point on appear in the same transaction UPDATE SAILORS SET RATING = RATING - 1 UPDATE RESERVES SET DATE = DATE+1 • A user can use commands COMMIT or ROLLBACK to explicitly end a transaction and start a new one COMMIT • Any new statements that follow will occur in a new transaction

Transactions in SQL • Because of performance, some anomalies might be acceptable for some

Transactions in SQL • Because of performance, some anomalies might be acceptable for some applications – Internet apps in particular! • SQL 92 supports different “Isolation Levels” for a transaction (Lost Update not allowed at any level) Isolation Level Dirty Read Unrepeatable Read Phantom Problem Read Uncommitted Maybe Read Committed No Maybe Repeatable Reads No No Maybe Serializable No No No

Aborting a Transaction (i. e. , Rollback) • If an xact Ti aborted, all

Aborting a Transaction (i. e. , Rollback) • If an xact Ti aborted, all actions must be undone. • To undo actions of an aborted transaction, DBMS maintains log which records every write. • Log is also used to recover from system crashes (which can occur as either hardware or software failure) – All active Xacts at time of crash are aborted when system comes back up.

The Database Log XID 1: REDO TID 1 <data> REDO TID 2 <data> COMMIT

The Database Log XID 1: REDO TID 1 <data> REDO TID 2 <data> COMMIT XID 2: REDO TID 3 <data> … Buffer Data Page • Updates are logged in the database log. • Log consists of “records” that are written sequentially. – Typically chained together by Xact id – Log is often archived on stable storage. – And backed up to even more stable storage (like tape!) Log Data Pages Log

The Database Log • Write-Ahead Logging protocol – Log record must go to disk

The Database Log • Write-Ahead Logging protocol – Log record must go to disk before the changed page! – All log records for a transaction (including its commit record) must be written to disk before the transaction is considered “Committed”. Update TID 2 <data> Update TID 1 <data> Buffer Data Page XID 1: REDO TID 1 <data> REDO TID 2 <data> COMMIT XID 2: REDO TID 3 <data> … Log 3 SQL: COMMIT XID 1 2 1 Data Pages Log

The Log • Log records are either UNDO or REDO records – Depends on

The Log • Log records are either UNDO or REDO records – Depends on the Buffer manager • UNDO required if: Buffer mgr allows uncommitted data to overwrite stable version of committed data (STEAL buffer management). • REDO required if: xact can commit before all its updates are on disk (NO FORCE buffer management). • The following actions are recorded in the log: – if Ti writes an object, write a log record with: • If UNDO required need “before image” • IF REDO required need “after image”. – Ti commits/aborts: a log record indicating this action.

How Professor Roth made some traders very unhappy at market open… • Log on

How Professor Roth made some traders very unhappy at market open… • Log on disk can fill up quickly for active DBMS • DBMS duplexes between two logs – Writes DB log to one – While archiving the other – When DB log is full, reverse • Must STOP all DBMS activity while you reverse XID 1: REDO TID 1 <data> REDO TID 2 <data> COMMIT XID 2: REDO TID 3 <data> … Log 1 Log 2

How Professor Roth made some traders very unhappy at market open… • Stock DB

How Professor Roth made some traders very unhappy at market open… • Stock DB at market open is VERY busy • We wanted to make sure there was an empty log just before market open • I wrote a script to ‘switch’ logs 5 mins before market open If (not already writing a log) Then switch logs Write the old log to tape • Except I forgot a comma, so it was more like this: Switch logs; write the old log to tape. . . and because the DB was writing out its log, the DB hung until it was done writing to tape! XID 1: REDO TID 1 <data> REDO TID 2 <data> COMMIT XID 2: REDO TID 3 <data> … Log I’m waiting… Log 1 Log 2

(Review) Goal: The ACID properties • • A tomicity: All actions in the Xact

(Review) Goal: The ACID properties • • A tomicity: All actions in the Xact happen, or none happen. C onsistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent. • I solation: Execution of one Xact is isolated from that of other Xacts. • D urability: If a Xact commits, its effects persist. What happens if system crashes between commit and flushing modified data to disk ?

D Durability - Recovering From a Crash • Three phases: – Analysis: Scan the

D Durability - Recovering From a Crash • Three phases: – Analysis: Scan the log (forward from the most recent checkpoint) to identify all Xacts that were active at the time of the crash. – Redo: Redo updates as needed to ensure that all logged updates are in fact carried out and written to disk. – Undo: Undo writes of all Xacts that were active at the crash, working backwards in the log. • At the end – all committed updates and only those updates are reflected in the database. • Some care must be taken to handle the case of a crash occurring during the recovery process!

Summary • Concurrency control and recovery are among the most important functions provided by

Summary • Concurrency control and recovery are among the most important functions provided by a DBMS. • Concurrency control is automatic – System automatically inserts lock/unlock requests and schedules actions of different Xacts – Property ensured: resulting execution is equivalent to executing the Xacts one after the other in some order. • Write-ahead logging (WAL) and the recovery protocol are used to: 1. undo the actions of aborted transactions, and 2. restore the system to a consistent state after a crash.