Distributed Transactions: Distributed Deadlocks and Recovery Techniques

Distributed Deadlocks
• We have talked about deadlocks in a single-server environment.
• Deadlocks have to be either
  • prevented, or
  • detected and resolved.
• Distributed deadlock: the waits-for cycle involves objects at more than one server.
• Detection
  • Global wait-for graph
  • Simple idea: a central server takes the role of global deadlock detector.

Interleavings of transactions U, V and W

U                            V                            W
d.deposit(10)   lock D
                             b.deposit(10)   lock B at Y
a.deposit(20)   lock A at X
                                                          c.deposit(30)   lock C at Z
b.withdraw(30)  wait at Y
                             c.withdraw(20)  wait at Z
                                                          a.withdraw(20)  wait at X

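Distributed deadlock
[Figure: (a) servers X, Y and Z holding objects A, B, C and D, with "held by" and "waits for" edges to transactions U, V and W; (b) the resulting wait-for cycle among U, V and W]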
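The central-detector idea can be sketched for the U, V, W example as follows. This is a minimal illustration, not a prescribed implementation: the function names `merge_local_graphs` and `find_cycle` and the dictionary representation of a wait-for graph are assumptions made for the sketch.

```python
from typing import Dict, List, Optional, Set

WaitFor = Dict[str, Set[str]]  # transaction -> transactions it waits for


def merge_local_graphs(local_graphs: List[WaitFor]) -> WaitFor:
    """Union the edges reported by every server into one global graph."""
    merged: WaitFor = {}
    for graph in local_graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged


def find_cycle(graph: WaitFor) -> Optional[List[str]]:
    """Return one cycle as a list of transactions, or None if there is none."""
    path: List[str] = []    # current depth-first path
    done: Set[str] = set()  # nodes fully explored without finding a cycle

    def visit(node: str) -> Optional[List[str]]:
        if node in path:
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None

    for start in sorted(graph):
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Each server sees only one edge of the cycle.
server_x = {"W": {"U"}}  # W waits for A, held by U
server_y = {"U": {"V"}}  # U waits for B, held by V
server_z = {"V": {"W"}}  # V waits for C, held by W

global_graph = merge_local_graphs([server_x, server_y, server_z])
cycle = find_cycle(global_graph)
```

No single local graph contains a cycle; only the merged graph does, which is why a global view (or probe forwarding) is needed.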

Phantom Deadlocks
• A phantom deadlock is a deadlock that is detected but is not really a deadlock.
• In distributed deadlock detection, servers pass along information about wait-for relationships, so if there is a deadlock it will eventually be detected at one place.
• Because collecting this information takes time, an object that was reported as locked may have been released in the meantime.
[Figure: local wait-for graphs at servers X and Y, and the global deadlock detector's combined graph containing the cycle T → U → V → T]
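A small sketch of how stale information can produce a phantom deadlock. The timeline below is a hypothetical one chosen to produce the cycle T → U → V → T, and `has_cycle` is an illustrative helper, not part of any particular system.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]  # (waiter, holder)


def has_cycle(edges: Iterable[Edge]) -> bool:
    """True if the waits-for edges contain a cycle."""
    graph: Dict[str, Set[str]] = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, set()).add(holder)

    def reachable(start: str) -> Set[str]:
        seen: Set[str] = set()
        stack = [start]
        while stack:
            for nxt in graph.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    return any(node in reachable(node) for node in graph)


# Assumed timeline:
# 1. Server X reports that T waits for U.
stale_from_x = {("T", "U")}
# 2. U releases the object at X, so the true graph at X is now empty,
#    but the detector has not yet received the update.
actual_at_x: Set[Edge] = set()
# 3. Server Y reports that U waits for V and V waits for T.
fresh_from_y = {("U", "V"), ("V", "T")}

phantom = has_cycle(stale_from_x | fresh_from_y)  # what the detector sees
real = has_cycle(actual_at_x | fresh_from_y)      # what is actually true
```

The detector's view contains a cycle while the true state does not, so aborting a transaction here would be unnecessary.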

Edge-chasing algorithm
Probes transmitted to detect deadlock
[Figure: servers X, Y and Z with objects A, B and C. Initiation: probe W → U. The probe is forwarded as W → U → V, then as W → U → V → W, at which point the deadlock is detected.]
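The probe sequence W → U, W → U → V, W → U → V → W can be reproduced with a short simulation. This is a simplified sketch: it assumes each transaction waits for at most one other transaction, and it keeps all waits-for edges in one dictionary, whereas in the real algorithm each server knows only its local edges and forwards the probe in a message. The name `chase_edges` is hypothetical.

```python
from typing import Dict, List, Optional


def chase_edges(waits_for: Dict[str, str], waiter: str) -> Optional[List[str]]:
    """Follow a probe starting at `waiter`.

    Returns the probe path if it closes a cycle, or None if the probe
    reaches a transaction that is not waiting.
    """
    holder = waits_for.get(waiter)
    if holder is None:
        return None
    probe = [waiter, holder]  # initiation: <waiter -> holder>
    while True:
        nxt = waits_for.get(probe[-1])
        if nxt is None:
            return None       # last transaction is not waiting: no deadlock
        probe.append(nxt)     # extend the probe and forward it
        if nxt in probe[:-1]:
            return probe      # a transaction appears twice: cycle found


deadlocked = chase_edges({"W": "U", "U": "V", "V": "W"}, "W")
not_deadlocked = chase_edges({"W": "U", "U": "V"}, "W")
```

Unlike the central detector, no server ever holds the whole graph; the probe itself carries the path.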

Two probes initiated
[Figure: waits-for relationships among transactions T, U, V and W, and the probes sent. (a) initial situation; (b) detection initiated at object requested by T; (c) detection initiated at object requested by W]

Probes travel downhill
(a) V stores probe when U starts waiting
(b) Probe is forwarded when V starts waiting
[Figure: probe queues. (a) U waits for B, and the probe U → V is stored in V's probe queue. (b) V waits for C, and the probes U → V and V → W are passed on to W.]

Recovery
Atomicity property
• Durability and failure atomicity
  • Durability: objects are saved in permanent storage.
  • Failure atomicity: the effects of transactions are atomic even when the server crashes.
• Assumptions
  • A running server keeps all its objects in volatile memory and records of committed transactions in a recovery file.
• Recovery
  • Restoring the server with the latest committed versions of its objects from permanent storage.
• Recovery manager
  • Saves objects in permanent storage for committed transactions.
  • Restores the server's objects after a crash.
  • Reorganizes the recovery file to improve performance.
  • Reclaims storage space.

Types of entry in a recovery file
• To deal with the recovery process, the recovery file records object values, the status of each transaction, and intentions lists (the list of references and values of all objects altered by that transaction; useful in 2PC).

Type of entry: description of contents of entry
• Object: a value of an object.
• Transaction status: transaction identifier, transaction status (prepared, committed, aborted) and other status values used for the two-phase commit protocol.
• Intentions list: transaction identifier and a sequence of intentions, each of which consists of <identifier of object>, <position in recovery file of value of object>.

Logging
• The log contains the history of all transactions performed by a server.
• The recovery manager is called when a server
  • is prepared to commit a transaction: it appends all the objects in the intentions list to the recovery file, followed by the current status of the transaction;
  • commits or aborts a transaction: it appends the corresponding status of the transaction.
• After a crash, any transaction that does not have a committed status is aborted (so, when a transaction commits, its status is forced onto the log).
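The append-only behaviour described above can be sketched with an in-memory list standing in for the recovery file. The class `RecoveryFile`, its method names and the tuple layout of the entries are assumptions for illustration; a real recovery manager would write to disk and force the commit entry to stable storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RecoveryFile:
    entries: List[tuple] = field(default_factory=list)

    def append(self, entry: tuple) -> int:
        """Append one entry and return its position in the file."""
        self.entries.append(entry)
        return len(self.entries) - 1

    def prepare(self, tid: str, tentative: Dict[str, int]) -> None:
        """Write the changed objects, then a 'prepared' status entry
        whose intentions list points at the positions just written."""
        intentions: List[Tuple[str, int]] = []
        for obj, value in tentative.items():
            position = self.append(("object", obj, value))
            intentions.append((obj, position))
        self.append(("status", tid, "prepared", intentions))

    def finish(self, tid: str, status: str) -> None:
        """Append the final status ('committed' or 'aborted')."""
        self.append(("status", tid, status, None))


log = RecoveryFile()
log.append(("checkpoint", {"A": 100, "B": 200, "C": 300}))  # position 0
log.prepare("T", {"A": 80, "B": 220})                         # positions 1, 2, 3
log.finish("T", "committed")                                  # position 4
```

With the checkpoint stored as a single entry, the positions 0 to 4 line up with P0 to P4 in the logging example.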

Example: Logging
• P0 holds a snapshot (checkpoint) of the values of A, B and C.
• When T commits, A and B are updated at P1, P2.
• T commits; U prepares for commit.

Position  Entry      Contents                     Back pointer
P0        Object: A  100                          (checkpoint)
          Object: B  200
          Object: C  300
P1        Object: A  80
P2        Object: B  220
P3        Trans: T   prepared, <A, P1> <B, P2>    P0
P4        Trans: T   committed                    P3
P5        Object: C  278
P6        Object: B  242
P7        Trans: U   prepared, <C, P5> <B, P6>    P4
End of log

Recovery
• The server restarts after a crash: it sets default initial values for its objects and starts the recovery manager.
• Two approaches:
  1. Forward
     • The recovery manager starts from the beginning of the recovery file and restores the values of all objects from the most recent checkpoint.
     • It then reads the values of the objects, associates them with their transactions' intentions lists and, for committed transactions, replaces the values of the objects.
     • This is like replaying the log.
  2. Backward
     • Uses the backward pointers.
     • Transactions with committed status are used to restore objects that have not yet been restored.
     • It continues until all of the server's objects are restored.
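The backward approach can be sketched over the log from the logging example. This is an illustrative sketch only: the tuple layout of the entries and the name `recover_backward` are assumptions, the log is scanned in reverse rather than by following stored back pointers, and the checkpoint is modelled as a single entry at position 0.

```python
from typing import Dict, List


def recover_backward(log: List[tuple]) -> Dict[str, int]:
    """Read the log from the end; restore each object from the newest
    committed transaction that wrote it, stopping at the checkpoint."""
    restored: Dict[str, int] = {}
    final_status: Dict[str, str] = {}
    for entry in reversed(log):
        kind = entry[0]
        if kind == "status":
            _, tid, status, intentions = entry
            # The first status seen from the end is the transaction's latest.
            final_status.setdefault(tid, status)
            if status == "prepared" and final_status[tid] == "committed":
                for obj, position in intentions:
                    if obj not in restored:
                        restored[obj] = log[position][2]
        elif kind == "checkpoint":
            for obj, value in entry[1].items():
                restored.setdefault(obj, value)
            break  # older entries are superseded by the checkpoint
    return restored


log = [
    ("checkpoint", {"A": 100, "B": 200, "C": 300}),   # P0
    ("object", "A", 80),                              # P1
    ("object", "B", 220),                             # P2
    ("status", "T", "prepared", [("A", 1), ("B", 2)]),  # P3
    ("status", "T", "committed", None),               # P4
    ("object", "C", 278),                             # P5
    ("object", "B", 242),                             # P6
    ("status", "U", "prepared", [("C", 5), ("B", 6)]),  # P7
]
state = recover_backward(log)
```

U has only a prepared status, so its writes at P5 and P6 are ignored; A and B come from T, and C falls back to the checkpoint value.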

Importance of Checkpointing
• It saves lots of extra work.
• What happens if there were no checkpoints?
  • To restore the value of C, we have to go back to P0 to restore all objects.
[Figure: the log from the logging example, P0 to P7, with the checkpoint at P0]

Shadow versions
• An alternative approach
  • Uses a map to locate versions of objects in a file called the version store.
  • The map associates each object identifier with the position of its current version in the version store.
  • The versions written by each transaction are shadows of the previous committed versions.
• How does it work?
  • When a transaction is prepared to commit, each object changed by the transaction is appended to the version store as a "shadow version".
  • When a transaction commits, a new map is made by copying the old map and entering the positions of the shadow versions.
  • The new map replaces the old map.
• Recovery
  • The recovery manager (RM) reads the map and uses it to locate the objects in the version store.
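A minimal in-memory sketch of the map and version store described above. The class `ShadowStore` and its method names are hypothetical; a real implementation would keep both structures on disk and replace the map atomically.

```python
from typing import Dict, List


class ShadowStore:
    def __init__(self, initial: Dict[str, int]) -> None:
        self.version_store: List[int] = []
        self.map: Dict[str, int] = {}  # object id -> position of current version
        for obj, value in initial.items():
            self.map[obj] = self._write(value)

    def _write(self, value: int) -> int:
        """Append a version and return its position in the version store."""
        self.version_store.append(value)
        return len(self.version_store) - 1

    def prepare(self, tentative: Dict[str, int]) -> Dict[str, int]:
        """Write shadow versions; the committed map is not touched."""
        return {obj: self._write(value) for obj, value in tentative.items()}

    def commit(self, shadows: Dict[str, int]) -> None:
        new_map = dict(self.map)  # copy the old map
        new_map.update(shadows)   # enter the positions of the shadow versions
        self.map = new_map        # the new map replaces the old map

    def recover(self) -> Dict[str, int]:
        """Use the map to locate each object's value in the version store."""
        return {obj: self.version_store[pos] for obj, pos in self.map.items()}


store = ShadowStore({"A": 100, "B": 200, "C": 300})
t_shadows = store.prepare({"A": 80, "B": 220})
before_commit = store.recover()
store.commit(t_shadows)
u_shadows = store.prepare({"C": 278, "B": 242})  # U prepares but never commits
after_commit = store.recover()
```

Because recovery only follows the map, U's uncommitted shadow versions are simply unreachable and need no undo step.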

Shadow versions
[Figure: the log from the logging example (P0 to P7) shown alongside the maps and the version store]

Map at start    Map when T commits
A → P0          A → P1
B → P0'         B → P2
C → P0''        C → P0''

Version store
Position  P0   P0'  P0''  P1  P2   P3   P4
Value     100  200  300   80  220  278  242
P0, P0' and P0'' form the checkpoint.

Log with entries relating to two-phase commit protocol
[Figure: log entries, in order]
• Trans: T, prepared, intentions list
• Coordinator: T, participant list: . . .
• Trans: T, committed
• Trans: U, prepared, intentions list
• Participant: U, coordinator: . .
• Trans: U, uncertain
• Trans: U, committed

Recovery of the two-phase commit protocol

Role, status: action of recovery manager
• Coordinator, prepared: No decision had been reached before the server failed. It sends abortTransaction to all the servers in the participant list and adds the transaction status aborted in its recovery file. Same action for state aborted. If there is no participant list, the participants will eventually time out and abort the transaction.
• Coordinator, committed: A decision to commit had been reached before the server failed. It sends a doCommit to all the participants in its participant list (in case it had not done so before) and resumes the two-phase protocol at step 4 (Fig 13.5).
• Participant, committed: The participant sends a haveCommitted message to the coordinator (in case this was not done before it failed). This will allow the coordinator to discard information about this transaction at the next checkpoint.
• Participant, uncertain: The participant failed before it knew the outcome of the transaction. It cannot determine the status of the transaction until the coordinator informs it of the decision. It will send a getDecision to the coordinator to determine the status of the transaction. When it receives the reply it will commit or abort accordingly.
• Participant, prepared: The participant has not yet voted and can abort the transaction.
• Coordinator, done: No action is required.
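The role and status cases above amount to a lookup that the recovery manager performs for each transaction found in the log. The sketch below encodes them as a table; the function name `recovery_action` and the returned strings are illustrative assumptions, and the message names are the ones used in the slide.

```python
from typing import Dict, Tuple

# (role, status recorded in the recovery file) -> action on restart
ACTIONS: Dict[Tuple[str, str], str] = {
    ("coordinator", "prepared"): "send abortTransaction to participants; log aborted",
    ("coordinator", "aborted"): "send abortTransaction to participants; log aborted",
    ("coordinator", "committed"): "send doCommit to participants; resume protocol",
    ("participant", "committed"): "send haveCommitted to coordinator",
    ("participant", "uncertain"): "send getDecision to coordinator; then commit or abort",
    ("participant", "prepared"): "abort the transaction",
    ("coordinator", "done"): "no action",
}


def recovery_action(role: str, status: str) -> str:
    """Return the recovery manager's action for one transaction."""
    try:
        return ACTIONS[(role, status)]
    except KeyError:
        raise ValueError(f"no recovery rule for {role!r} in state {status!r}")


uncertain_action = recovery_action("participant", "uncertain")
```

Only the uncertain participant has to wait on another server; every other case can be resolved from the local recovery file.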