Concurrency Control Single User Versus Multiuser System Database

Concurrency Control Single User Versus Multi-user System: • Database system can be classified on the basis of the number of users who can use the system concurrently. • A DBMS is single user if at most one user at a time can use the system. • It is multi-user if many users can use the system and hence access the database concurrently. • E. g. Airline Reservation System.

Multiprogramming: • Allows computer to execute multiple programs or processes at the same time. • If only single CPU exists it can actually execute at most one process at a time. • Multiprogramming OS executes some commands from one process, then suspend that process and execute some commands from next process and so on. • A process is resumed at the point where it was suspended, whenever it gets its turn to use the CPU again.

Parallel Processing: • If the computer has multiple hardware processors parallel processing of multiple processes is possible. • All the database access operations are specified in the explicit BEGIN & END statements form one transaction.

Basic Database Access Operations: 1. read_item(x): • Reads a database item named x into a program variable. • Executing a read_item(x) command includes the following steps: 1. Finds the address of the disk block that contains item x. 2. Copy that disk block into a buffer in main memory. 3. Copy item x from the buffer to the program variable named x.

2. write_item(x): • Writes the value of a program variable x into the database item named x. • Executing a write_item(x) command includes the following steps: 1. Find the address of the disk block that contains item x. 2. Copy that disk block into a buffer in main memory. 3. Copy item x from the program variable named x into its correct location in the buffer. 4. Store location in the buffer. • Step 4 is one that actually updates the database on disk.

E. g. Of Transactions: T 1 Read_item(x); X: = x-n; Write_item(x); Read_item(y); Y: = y+n; Write_item(y); T 2 Read_item(x); x: = x+m; write_item(x);

• The read-set of a transaction is the set of all items that the transaction reads. • The write-set of a transaction is the set of all the items that the transaction writes. • E. g. In the above case, read set of T 1 is {x, y} and its write- set is also {x, y}. • read set of T 2 is {x} and its write- set is also {x}.

State Transition Diagram: Begin Transaction End Transaction Active Abort Partially Committed Abort Failed Terminated

Properties of Transaction: A – Atomicity: • Atomicity property of a transaction states that the operations in a transaction are either executed completely or not at all. C – Consistency: • The consistency property sates that the operation sequence of a transaction takes the database from any consistent state to another consistent state (correct, integrity).

I – Isolation: • Isolation property states that the intermediate states of transactions are not visible to other transactions. D – Durability: • Durability states that the effects of completed transactions are not lost due to hardware or software failures

Transactions – Isolation: • Isolation means that: – Multiple transactions running at the same time not impact each other’s execution. – Each user has the impression that he/she has exclusive access for the entire transaction. – All other transactions that happen at the same time should appear either as before or after it.

Schedules: • When transactions are executing concurrently in an interleaved fashion, then the order of execution of operations from various transactions is known as a schedule (or history).

Examples of Schedule: T 1 read_item(x) x: = x-n; T 2 read_item(x); x: = x+m; write_item(x); read_item(y); write_item(x); y: = y+n; Write_item(y); Schedule: r 1(x); r 2(x); w 1(x); r 1(y); w 2(x); w 1(y);

T 1 read_item(x) x: = x-n; Write_item(x) T 2 read_item(x) x: = x+m; write_item(x) Write_item(y) Schedule: r 1(x); w 1(x); r 2(x); w 2(x); r 1(y);

Conflicting Operations: • Two operations in a schedule are said to conflict, if they satisfy all three of the following conditions: 1. They belong to different transactions. 2. They access the same item x. 3. At least one of the operation is a write_item(x).

Serializability of Schedules: • Serializability of schedules is used to identify which schedules are correct when transaction executions have interleaving of their operations in the schedules. • A schedule can be: 1. Serial Schedule: 2. Non-serial Schedule:

Serial Schedules: T 1 Read_item(x); X: = x-n; Write_item(x); Read_item(y); Y: = y+n; Write_item(y); T 2 read_item(x); x: = x + m; write_item(y);

Non - Serial Schedules: T 1 Read_item(x); X: =x-n; T 2 read_item(x); X: =x+m; Write_item(x); Read_item(y); write_item(x); Y: =y+n; Write_item(y);

Concurrency control: – Takes care that transactions access database items (database, table, page, row, index key) such that the meaningful results are produced. – Produces a schedule of database operations from transactions running concurrently so the order of operations for each particular transaction is preserved.

Need for Concurrency Control : • If DBMS provides concurrency control support for transactions, users/programmers do not need to worry that there are other transactions running at the same time or not. • Problems can occur when concurrent transactions execute in an uncontrolled manner. • Isolation (+ Consistency) => Concurrency Control. • Multiple transactions may want to access and modify the same resources. • Whenever multiple processes share resources there is need to schedule the access.

Transaction Schedule – By Example: • Assume that Transaction T 1 has operations O 1 O 2 O 3 • Assume that Transaction T 2 has operations P 1 P 2 P 3 – O 1 O 2 P 1 O 3 P 2 P 3 is a schedule – O 1 P 1 O 3 P 2 P 3 O 2 is not a schedule – order is not preserved operation O 3 must be executed after O 2 within T 2

Serial Schedule: • Schedule is serial if all operations from one transaction are completed prior to beginning of another transaction. – Each serial schedule is considered correct since one transaction is independent of the other transactions – There is no overlapping of transactions

Serial Schedule – Examples and Main Problem: – In the example on the right: – Transactions T 1 and T 2 update totally different items of the database (X, Y) – Hence, T 1 and T 2 could have been executed concurrently (“in parallel”) – Serial schedules are always correct but do not use computer resources on optimal way (for concurrency and performance)

How To Improve Efficiency? Non-serial schedules : • Allow transactions to occur at the same time (concurrently). • Operations of one transaction can be executed before another transaction is committed. • Schedules where transactions occur concurrently are called non-serial schedules or concurrent schedules.

Non-Serial Schedule – Example and New Problems : • If operations are not “meaningfully” ordered, we can get unexpected results. • Typical problems with schedules: – Dirty read – Non-repeatable read – Phantom read

Dirty Read : • A Dirty Read occurs because transaction T 2 sees the uncommitted results of transaction T 1. – Transaction T 1 reads an item and updates it – Transaction T 2 reads updated item – Transaction T 1 might abort in the future (and its update would be annulled). – In meantime, transaction T 2 proceeds with the item that now has incorrect / uncommitted value. – Expected (good) behavior if the transactions were serialized: Once T 1 is aborted, T 2 will still use the old (valid, non-updated) value of the item.

Non-Repeatable Read : • A No repeatable Read occurs if transaction T 1 retrieves a different result from the each read. – Transaction T 1 reads an item. – Transaction T 2 reads and updates the same item. – Transaction T 1 reads the same item again, but now it has a new, modified value. – Expected (good) behavior if the transactions were serialized: If a transaction only reads (and does not modify) the item, each time the item is read, the same value will be obtained.

Phantom Read : • A Phantom Read occurs if transaction T 1 obtains a different result from each Select for the same criteria. – Transaction T 1 executes search on certain criteria and retrieve m items from a table. – Transaction T 2 inserts another item that would match the search criteria. – Transaction T 1 again executes search and now retrieves m+1 items from the table. – Expected (good) behavior if the transactions were serialized: The first and the second search within the same transaction will give the same result

Locking Technique for Concurrency Control: • Locking is very important in a multi-user DBMS. • Locking allows one user to work with a data item without another user changing the data item's value. • Locking is necessary for maintaining data integrity while concurrent users access database information.

Locks: • Lock is implemented as a variable associated to a data item. • Locks can be placed explicitly by the program, or implicitly by the DBMS. • Lock describes status of an item with respect to operations that can be performed on the item. • Generally there is one lock for each data item in the database. • Locks are used as a means of synchronizing the access by concurrent transactions to the database items.

Types of Locks: 1. Binary Locks: • A binary lock can have two states i. e. locked (1)and unlocked(0). • A distinct lock is associated with each data item X. • If the value of the lock on X is 1. item X cannot be accessed by a database operation that requests the item. • If the value of the lock on X is ), the item can be accessed when requested. • The current value (state) of the lock associated with item X as LOCK(X).

• Two operations, lock-item and unlock – item are used with binary locking. • A transaction requests access to an item X by first issuing a lock_item(x) operation. • If Lock(X) = 1, the transaction is forced to wait. • If Lock(X) = 0, it is set to 1 (the transaction locks the item), and the transaction is allowed to access item(X). • When the transaction is finished using the item, it issues an unlock_item(X) operation, which sets LOCK(X) to 0 so that X may be accessed by other transactions. • Each lock is a record with 3 fields: • <data item name, LOCK, locking transaction> plus a queue for transactions that are waiting to access the item.

• In a simple Binary Locking scheme, every transaction must obey the following rules: - 1. A transaction T must issue the operation lock_item(X) before any read_item(X) or Write_item(X) operations are performed in T. 2. A transaction T must issue the operation Unlock_item(X) after all read_item(X) and write_item(X) operations are completed in T. 3. A transaction T will not issue a lock_item(X) operation if it already holds the lock on item X. 4. A transaction T will not issue an unlock_item(X) operation unless it already holds the lock on item X.

2. Shared/ Exclusive (Read/Write) Locks: • Binary locking scheme is too restrictive for database items, because at most one transaction can hold a lock on the given item. • Several transactions can access the same item X if they all access X for reading purpose only. • However if a transaction is to write an item X, it must have exclusive access to X. • For this purpose, a different type of lock called a multiple mode lock is used.

• In shared/Exclusive or Read/Write locking scheme, there are three locking operations: • Read_lock(X), • Write_lock(X), • Unlock(X). • A lock associated with an item X, Lock(X), has 3 possible states: “read_locked”, “write_locked” or “unlocked”. • A read locked item is called Shared_locked, because other transactions are allowed to read the item, where as a write_locked item is called exclusive_locked, because a single transaction exclusively holds the lock on the item.

• Each record in the lock table will have four fields: • <data item name, lock, no. of records, locking transactions>. • To save space , the system maintains lock records only for locked items in the lock table. • The value (state) of lock is either read_lockes or write_locked.

• In shared / Exclusive locking scheme, a transaction must obey the following rules: - 1. A transaction T must issue the operation read_lock(X) or Write_lock(X) before any read_item(X) is performed in T. 2. A transaction T must issue the operation Write_lock(X) before any write_item(X) operation is performed in T. 3. A transaction T must issue the operation Unlock(X) after all read_item(X) and write_item(X) operations are completed in T. 4. A transaction T will not issue a read_lock(X) operation if it already holds a read (shared) lock or a write (exclusive) lock on item X. 5. A transaction T will not issue a write_lock(X) operatio 9 n if it already holds a read (shared) lock or write (exclusive) lock on an item X. 6. A transaction T will not issue an unlock(X) operation. Unless it already holds a read (shared) lock or a write (exclusive) lock on item X.

Conversion of Locks: • A transaction that already holds a lock on item X is allowed under certain conditions to convert the lock from one locked state to another. • This is called as lock conversion. • E. g. It is possible for a transaction T to issue a read_lock(X) and then later upgrade the lock by issuing a write_lock(X) operation. • If T is the only transaction holding a read lock in X at the time it issues the write_lock(X) operation, the lock can be upgraded; otherwise the transaction must wait.

• It is also possible for a transaction T to issue a write_lock(X) and then later on to downgrade the lock by issuing a read_lock(X) operation.

Two Phase Locking: • Binary Locks/ & Read/Write locks in transactions, does not guarantee serializability of schedules on it own. • To guarantee serializability, an additional protocol, concerning the positioning of locking and unlocking operations in every transactions. • Such a transaction can be divided into 2 phases: 1. Expanding /Growing Phase. 2. Shrinking Phase.

Expanding Phase: • In this phase, new locks on items can be acquired but none can be released. Shrinking Phase: • In this phase existing locks can be released but no new locks can be acquired. • If lock conversion is allowed, then upgrading of locks (i. e. from read_locked to write_locked) must be done during the expanding phase, and downgrading of locks (from write_locked to read_locked) must be done in shrinking phase. • A transaction that follows two phase locking guarantees serializability.

• Two phase locking limits the amount of concurrency that can occur in a schedule. • This is because transaction T may not be able to release an item X after it is finished using it. • If T wants to lock an additional item Y later on, T must lock it before it needs. • Other transaction wanting to lock Y cannot lock if it because T has already locked it even though it is not using Y. • Hence the other transaction is forced to wait until T goes into shrinking phase.

Basic Two Phase Locking: • The technique discussed above is known as basic two phase locking. Conservative Two Phase Locking: • Conservative 2 PL requires a transaction to locl all the items it accesses before the transaction begins execution by pre declaring its write set and read set. • If any of the pre declared items needed cannot be locked, the transaction does not lock any item; instead it waits until all the items are available for locking.

Strict 2 Phase Locking: • A transaction T does not release any of its exclusive (write) locks until after it commits or aborts. • Hence no other transaction can read or write an item that is written by T unless T has committed. Rigorous 2 Phase Locking: • A transaction T does not release any of its locks (exclusive or shared) until after it commits or aborts and so it is easier to implement than strict 2 PL.

Deadlock: • Deadlock occurs when each transaction T in a set of two or more transactions is waiting for some item that is locked by some other transaction T 1 in the set. • Hence each transaction in the set is on a waiting queue, waiting for one of the other transactions in the set to release the lock on an item.

Example: T 1 read_lock(y); Read_item(y); T 2 read_lock(x); read_item(x); Write_lock(x) write_lock(y);

• In the above example, transactions T 1 & T 2 are deadlocked. • T 1 is on the waiting queue for X which is locked by T 2 and T 2 is on the waiting queue for Y, which is locked by T 1. • Meanwhile, neither T 1 nor T 2 nor any other transaction can access items X and Y. T 1 T 2

Deadlock prevention Techniques: • Deadlock prevention protocols ensure that the system will never enter into a deadlock state. • Some prevention strategies : 1. Require that each transaction locks all its data items before it begins execution (pre declaration). 2. Impose partial ordering of all data items and require that a transaction can lock data items only in the order specified by the partial order (graph based protocol).

Timestamp: • A technique used to prevent deadlock is by using the concept of transaction timestamp TS(T). • Transaction timestamp is a unique identifier assigned to each transaction. • The timestamps are typically based on the order in which transactions are started. • Hence if transaction T 1 starts before transaction T 2, then TS(T 1) < TS (T 2). • Older transactions has a smaller timestamp value.

• Two schemes that prevent deadlock are called: 1. wait-die 2. wound-wait. • Suppose that transaction Ti tries to lock an item X but is not able to because X is locked by some other transaction Tj with a conflicting lock, • The rules followed by these schemes are as:

Wait – Die: • If TS(Ti) < TS(Tj) then Ti is allowed to wait; otherwise (Ti younger than Tj) abort Ti(Ti dies) and restart it later with the same timestamp. Wound – Wait: • If TS(Ti) < TS(Tj) (i. e. Ti older than Tj) abort Tj(Ti wounds Tj) and restart it later with the same timestamp; otherwise (Ti younger than Tj) Ti is allowed to wait.

• In wait – die, an older transaction is allowed to wait on a younger transaction, whereas a younger transaction requesting an item held by an older transaction is aborted and restarted. • The wound – wait approach does the opposite. • A younger transaction is allowed to wait whereas an older transaction requesting an item held by a younger transaction preempts the younger transaction by aborting it. • Both schemes end up aborting the younger of the two transactions that may be involved in a deadlock.

• Deadlock prevention without using timestamps: 1. No Waiting: • In no waiting algorithm if a transaction is unable to obtain a lock, it is immediately aborted and then restarted after a certain time delay without checking whether a deadlock will actually occur or not. This scheme causes transactions to abort and restart needlessly.

2. Cautious Waiting: • Cautious waiting was proposed to try to reduce the number of needless aborts/restarts. Suppose transaction Ti tries to lock an item X but is not able to do so because X is locked by some other transaction Tj with a confliction lock, CW states that if Tj is not blocked, then Ti is blocked and allowed to wait; otherwise abort Ti.

Deadlock detection and Timeouts: • Another approach to deal with deadlock is deadlock detection. • Here the system checks if a state of deadlock actually exists. • This solution is practicable if we know there will be little interference among the transactions i. e. if different transactions will rarely access the same items at the same time. • This can happen if the transactions are short and each transaction locks only a few items or If the transaction load is light.

• A simple way to detect a state of deadlock is for the system to construct and maintain a Wait-For graph. • One node is created for each transaction that is concurrently executing. • Whenever a transaction Ti is waiting to lock an item X that is currently locked by a transaction Tj, a directed edge (Ti Tj) is created in the wait for graph. • When Tj releases the lock on the items that Ti was waiting for the directed edge is dropped from the wait for graph.

Timeouts: • Simple scheme to deal with deadlock is the use of timeouts. • This method is practical because of its low overhead and simplicity. • In this method, if a transaction waits for a period longer that a system defined timeout period, the system assumes that the transaction may be deadlocked and aborts it regardless of whether a deadlock actually exists or not.

Starvation: • Another problem that may occur when using locking is starvation. • It occurs when a transaction cannot proceed for an indefinite period of time while other transactions in the system continue normally. • This may occur if the waiting scheme for starvation is unfair, giving priority to some transactions over others. • One solution for starvation is to have a fair waiting scheme, such as using a first – come- first –serve queue; transactions are enabled to lock an item in the order in which they originally requested the lock.