3 Transaction processing concepts What is a transaction

  • Slides: 37
Download presentation
3. Transaction processing concepts What is a transaction? • Logical unit of work •

3. Transaction processing concepts What is a transaction? • Logical unit of work • Execution of a program that accesses or changes the database Fundamental property of a database: • The data are shared among several users and applications concurrent processing, possibly conflicting interests. Adv. DB-3 J. Teuhola 2015 64

Transaction viewpoints • A transaction should lead the database from one consistent state to

Transaction viewpoints • A transaction should lead the database from one consistent state to another. • Partial execution not allowed (principle: all or nothing) • Concurrent access to data by multiple transactions should be supported. • Transactions may end prematurely, due to system (hardware/software) failure; recovery mechanisms are needed. Adv. DB-3 J. Teuhola 2015 65

Categories of database processing • OLTP = On-Line Transaction Processing – Needed for the

Categories of database processing • OLTP = On-Line Transaction Processing – Needed for the everyday functioning of the organization’s main activities, accessing the ’operative’ databases – Short transactions and response times • OLAP = On-Line Analytic Processing – – Decision support, data mining Long read-only transactions ’Data warehouse’ The data need not be exactly up-to-date; refreshed periodically Adv. DB-3 J. Teuhola 2015 66

Interleaved execution • Concurrent transactions may be executed in an interleaved fashion. • Transactions

Interleaved execution • Concurrent transactions may be executed in an interleaved fashion. • Transactions consist of steps, between which the data may be inconsistent. • Arbitrary interleaving of steps will cause problems (interference): – Lost update problem – Temporary update problem (‘dirty read’) – Incorrect summary problem Adv. DB-3 J. Teuhola 2015 67

Lost update problem Example: Bank account, two parallel withdrawals Time Transaction 1: Withdraw 200

Lost update problem Example: Bank account, two parallel withdrawals Time Transaction 1: Withdraw 200 from account 123 Transaction 2: Withdraw 100 from account 123 bal = read_balance(123) = 1000 € bal = bal – 200 € bal = read_balance(123) = 1000 € bal = bal - 100 € write_balance(123, bal) = 800 € write_balance(123, bal) = 900 € • The first update is lost. Note that the transactions are not aware of each other’s internal program variables (bal). Adv. DB-3 J. Teuhola 2015 68

Temporary update (‘dirty read’) problem Example transactions on bank accounts: Time Transaction 1 Transfer

Temporary update (‘dirty read’) problem Example transactions on bank accounts: Time Transaction 1 Transfer 100 € from account 123 to account 789 Transaction 2 Withdraw 200 € from account 123 bal = read_balance(123)=1000€ bal = bal - 100€ write_balance(123, bal)=900€ bal = read_balance(123) = 900€ bal = read_balance(789)=2000€ abort; recover_balance(123)=1000€ bal = bal - 200 € write_balance(123, bal)=700 € // Should be 800 € Adv. DB-3 J. Teuhola 2015 69

Incorrect summary problem Time Transaction 1 Sum of balances Transaction 2 Transfer 100 €

Incorrect summary problem Time Transaction 1 Sum of balances Transaction 2 Transfer 100 € from 789 to 123 sum = 0 bal = read_balance(123) = 1000 € sum = sum + bal = 1000 € bal = read_balance(456) = 2000 € sum = sum + bal = 3000€ bal = read_balance(789) = 3000 € bal = bal – 100 € write_balance(789, bal) = 2900 € bal = read_balance(123) = 1000 € bal = bal + 100€ write_balance(123, bal) = 1100 € bal = read_balance(789) = 2900 € sum = sum + 2900 € = 5900 € // Should be 6000 € Adv. DB-3 J. Teuhola 2015 70

Reasons for recovery • DBMS must ensure that each transaction is either (1) executed

Reasons for recovery • DBMS must ensure that each transaction is either (1) executed in totality (all operations completed successfully; updates made permanent), or (2) has no effect on the database contents nor on other transactions. • In case of failure, the DBMS must roll back the (partially executed) transaction so that (2) holds. Adv. DB-3 J. Teuhola 2015 71

Types of failures • Hardware/software failure: contents of main memory may be lost. •

Types of failures • Hardware/software failure: contents of main memory may be lost. • Transaction error: e. g. illegal operation or user interrupt. • Logical error: Violation of database consistency constraints. • Concurrency control error: Interference of transactions, e. g. deadlock. • Disk failure, accidents, etc. : Some data are usually lost. Adv. DB-3 J. Teuhola 2015 72

Transaction events • • begin_transaction read/write commit: successful end; fix the changes rollback (abort):

Transaction events • • begin_transaction read/write commit: successful end; fix the changes rollback (abort): unsuccessful ending undo effects of a failed transaction redo effects of a successful transaction end_transaction Adv. DB-3 J. Teuhola 2015 73

System log • On disk (permanent storage) • Enables recovery from failures • Stores

System log • On disk (permanent storage) • Enables recovery from failures • Stores data about transaction states and performed updates (before-/after-images) • Transactions identified by a unique id. • Log entries: start, [read, ] write, commit, abort • Undo needs before-image (old value) • Redo needs after-image (new value) Adv. DB-3 J. Teuhola 2015 74

Commit point • Database operations performed successfully • Effects recorded in the log. •

Commit point • Database operations performed successfully • Effects recorded in the log. • At commit the log page in buffer must be forcewritten to disk. • Writing other buffer pages may be postponed. • At failure: – Uncommitted transactions are undone. – Committed transactions may have to be redone. Adv. DB-3 J. Teuhola 2015 75

Checkpoint • Effects of write operations of committed transactions are forced to disk. •

Checkpoint • Effects of write operations of committed transactions are forced to disk. • A transaction need not be redone at failure, if its commit-entry in the log is before the last checkpoint. • Checkpoint is taken periodically. Adv. DB-3 J. Teuhola 2015 76

Checkpoint actions • Suspend execution. • Force-write committed updates. • Write the checkpoint info

Checkpoint actions • Suspend execution. • Force-write committed updates. • Write the checkpoint info to the log: – active transactions – pointers to the first and last log entries of transactions (to speed-up recovery) • Resume execution. Adv. DB-3 J. Teuhola 2015 77

Transaction processing scenario Main memory Program 1 Program 2 DBMS manages DB buffer Log

Transaction processing scenario Main memory Program 1 Program 2 DBMS manages DB buffer Log buffer ‘Page frames’ Force-write at checkpoint DB Disk pages Adv. DB-3 Force-write at commit Log pages J. Teuhola 2015 Before-/ after-images of changed tuples 78

Desirable (ACID) properties of transactions • Atomicity: All or nothing. • Consistency: From one

Desirable (ACID) properties of transactions • Atomicity: All or nothing. • Consistency: From one consistent state of the database to another. • Isolation: Updates are invisible before commit; no need for cascading rollbacks. • Durability: Committed changes are not lost (responsibility of the recovery method), except in catastrophic failures. Adv. DB-3 J. Teuhola 2015 79

Levels of isolation 0: No overwrite of dirty reads. 1: No lost updates. 2.

Levels of isolation 0: No overwrite of dirty reads. 1: No lost updates. 2. No dirty reads & no lost updates. 3. ‘True isolation’: No dirty reads & no lost updates & repeatable read. Adv. DB-3 J. Teuhola 2015 80

Schedule • Assume a set of transactions {T 1, . . . , Tn}

Schedule • Assume a set of transactions {T 1, . . . , Tn} with ordered operations: Ti = <op 1>i ; <op 2>i ; . . . • A schedule S is an ordered set of operations, where each Ti-sequence is a subsequence of S. • Notice: Operations of different transactions may be interleaved. Adv. DB-3 J. Teuhola 2015 81

Example schedule • Operation sequences of three transactions: T 1: T 2: T 3:

Example schedule • Operation sequences of three transactions: T 1: T 2: T 3: r 1(x); r 1(y); w 1(x); c 1; r 2(y); w 2(y); c 2; r 3(x); r 3(y); w 3(x); c 3; • One possible interleaved schedule: r 1(x); r 2(y); r 1(y); w 1(x); r 3(y); w 2(y); w 3(x); c 2; c 1; c 3; Adv. DB-3 J. Teuhola 2015 82

Conflict of operations • Notation for operations: <op><trans-id>(<data item>) where <op> is one of

Conflict of operations • Notation for operations: <op><trans-id>(<data item>) where <op> is one of {r, w, c, a} = {read, write, commit, abort} • Conflict (= danger): Two operations <op 1>i(X) and <op 2>j(X) may cause trouble if i j and (<op 1> = w or <op 2> = w) Adv. DB-3 J. Teuhola 2015 83

Example of conflicts • Schedule: r 1(x); r 1(y); r 2(x); w 1(x); r

Example of conflicts • Schedule: r 1(x); r 1(y); r 2(x); w 1(x); r 2(y); w 2(x); c 1; c 2; • Conflicting pairs of operations: – r 1(x) and w 2(x) – r 2(x) and w 1(x) – w 1(x) and w 2(x) • Not in conflict: – r 1(y) and r 2(y) Adv. DB-3 J. Teuhola 2015 84

Complete schedule S is a complete schedule of T 1, T 2, . .

Complete schedule S is a complete schedule of T 1, T 2, . . . , Tn, if (1) S contains operations of T 1, T 2, . . . , Tn (but no others). (2) Operation order of each Ti is preserved. (3) Order of operations is fixed for each conflicting pair; otherwise ordering may be partial. (4) Commit/abort is the last operation of each Ti. Adv. DB-3 J. Teuhola 2015 85

Committed projection • Committed projection of S = All operations of S belonging to

Committed projection • Committed projection of S = All operations of S belonging to committed transactions. • Reason for this concept: New transactions appear all the time; difficult to find a moment when no active transactions exist. Adv. DB-3 J. Teuhola 2015 86

Recoverable schedule • Transaction T 2 that reads X does not commit until T

Recoverable schedule • Transaction T 2 that reads X does not commit until T 1 that wrote X (before the read) has committed. • Explanation: T 2 can be rolled back as long as it has not committed. If T 1 aborts, T 2 should be rolled back to avoid consequences of dirty read. Adv. DB-3 J. Teuhola 2015 87

Cascading rollback Cascade phenomenon: T 2 has to be aborted because T 1 aborted;

Cascading rollback Cascade phenomenon: T 2 has to be aborted because T 1 aborted; T 3 has to be aborted because T 2 aborted; . . . Prevention: Read an item only after the writing transaction has committed (cascadeless schedule). Strict schedule: Read/write X only after the last writer transaction of X has committed. Advantage: Simpler undo. Disadvantage: Lower level of concurrency Adv. DB-3 J. Teuhola 2015 88

Serializability Aim: • Correct (but not necessarily serial) schedules for interfering transactions Means: •

Serializability Aim: • Correct (but not necessarily serial) schedules for interfering transactions Means: • Sufficient isolation Serial schedule: • All operations of each transaction Ti are executed consecutively, without interleaving. Adv. DB-3 J. Teuhola 2015 89

Serializability (cont. ) Axiom: For independent transactions every serial schedule is correct (n! alternatives

Serializability (cont. ) Axiom: For independent transactions every serial schedule is correct (n! alternatives for n transactions). Problem: Limited concurrency; if one transaction waits for I/O, another cannot take over; the processor is idle. Serializable schedule: Equivalent to some serial schedule (same result, though interleaved execution); means logical correctness. Adv. DB-3 J. Teuhola 2015 90

Forms of schedule equivalence • Result equivalence: May be accidental; it should hold for

Forms of schedule equivalence • Result equivalence: May be accidental; it should hold for all possible database states. • Conflict equivalence: The order of any two conflicting operations is the same in both schedules. • View equivalence: A read operation reads the result of the same write operation (or none) in both schedules. Adv. DB-3 J. Teuhola 2015 91

Forms of serializability Corresponding to the forms of equivalence: • Conflict serializability: Simple testing

Forms of serializability Corresponding to the forms of equivalence: • Conflict serializability: Simple testing algorithm • View serializability: Less restrictive than conflict serializability. Testing is an NP-complete problem. Note. Every conflict-serializable schedule is also view-serializable, but not vice versa. Adv. DB-3 J. Teuhola 2015 92

Testing conflict serializability 1. For each transaction, create a node in a precedence graph.

Testing conflict serializability 1. For each transaction, create a node in a precedence graph. 2. For each pair of conflicting operations <op>Ti(X) and <op>Tj(X), occurring in this order, create an edge from Ti to Tj. 3. The schedule is serializable if the precedence graph has no cycles. Adv. DB-3 J. Teuhola 2015 93

Serializable schedule equivalent serial schedule 1. Derive the precedence graph of transactions as above.

Serializable schedule equivalent serial schedule 1. Derive the precedence graph of transactions as above. 2. Perform a topological sort of the graph nodes. 3. The resulting sequence is the solution (which may not be unique, in general). Adv. DB-3 J. Teuhola 2015 94

Example of a precedence graph Schedule: r 1(x), r 2(y), w 1(x), r 3(y),

Example of a precedence graph Schedule: r 1(x), r 2(y), w 1(x), r 3(y), w 3(x), w 3(y), w 2(y), c 1, c 2, c 3 1 2 y x y 3 Adv. DB-3 J. Teuhola 2015 The schedule is not serializable, due to the cycle. 95

Problems of serializability • Difficult to determine the order of operations beforehand. • Scheduling

Problems of serializability • Difficult to determine the order of operations beforehand. • Scheduling must be done on-the-fly, based on waiting, priorities, system load, etc. • Testing serializability afterwards may be too optimistic (cf. optimistic concurrency control). • Serializability can be applied only to the committed projection. Adv. DB-3 J. Teuhola 2015 96

Solution to maintaining serializability • Control the interleaving of operations by obeying rules (a

Solution to maintaining serializability • Control the interleaving of operations by obeying rules (a protocol) that guarantee serializability without checking it. (E. g. two-phase locking protocol, see later) • Conclusion: Serializability theory helps to gain better understanding of protocols, to obtain correct interleaving and improved concurrency. Adv. DB-3 J. Teuhola 2015 97

Debit-credit transactions • Less restrictive form of equivalence • Addition to and subtraction from

Debit-credit transactions • Less restrictive form of equivalence • Addition to and subtraction from an item can be done in any order (commutativity). • A schedule is correct if <read, add/subtract, write> triples are not interrupted. Adv. DB-3 J. Teuhola 2015 98

Granularity • To be decided: Unit of data considered in scheduling • A smaller

Granularity • To be decided: Unit of data considered in scheduling • A smaller unit enables a higher level of concurrency (but involves also more effort to manage) • A (disk) page is often the simplest unit of operation. Adv. DB-3 J. Teuhola 2015 99

Transaction support in SQL • Implicit BEGIN_TRANSACTION • Explicit COMMIT or ROLLBACK • SET

Transaction support in SQL • Implicit BEGIN_TRANSACTION • Explicit COMMIT or ROLLBACK • SET TRANSACTION characteristics: – Access mode: READ ONLY or READ WRITE – Diagnostic area size: Number of feedback conditions held simultaneously. – Isolation level (from weakest to strongest): • • READ UNCOMMITTED (allows dirty reads) READ COMMITTED (allows non-repeatable reads) REPEATABLE READ (allows ‘phantoms’) SERIALIZABLE (actually stronger than defined earlier) Adv. DB-3 J. Teuhola 2015 100