CS 542 Topics in Distributed Systems Distributed Transactions
CS 542: Topics in Distributed Systems Distributed Transactions and Two Phase Commit Protocol
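Distributed Transactions
• A distributed transaction is a transaction that invokes operations at several servers.
[Figure: a flat distributed transaction, in which the client's transaction T invokes operations at several servers directly, and a nested distributed transaction, in which T has subtransactions T1 and T2, which in turn have subtransactions T11, T12 and T21, T22.]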
Coordinator of a Distributed Transaction
• In a distributed environment, a coordinator is needed.
• The client sends an openTransaction to the coordinator.
  – Other servers that manage the objects accessed by the transaction become participants.
  – The coordinator provides a join method interface.
Distributed banking transaction
Client:
    T = openTransaction
        A.withdraw(4);
        C.deposit(4);
        B.withdraw(3);
        D.deposit(3);
    closeTransaction
Participants (each sends a join to the coordinator):
• BranchX holds account A and performs A.withdraw(4).
• BranchY holds account B and performs B.withdraw(3); the request arrives as B.withdraw(T, 3), carrying the transaction identifier T.
• BranchZ holds accounts C and D and performs C.deposit(4) and D.deposit(3).
The client's openTransaction and closeTransaction go to the coordinator.
Note: the coordinator is in one of the servers, e.g. BranchX.
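The banking example can be sketched in a few lines of Python. This is only an illustrative sketch: the `Coordinator` and `Branch` classes, the starting balances, and the rule that a branch joins on its first operation within a transaction are all assumptions made for the example, not something the slide specifies.

```python
class Coordinator:
    """Hypothetical coordinator: issues transaction ids and records who joined."""

    def __init__(self):
        self._next_id = 0
        self.participants = {}  # transaction id -> set of branch names

    def open_transaction(self):
        self._next_id += 1
        self.participants[self._next_id] = set()
        return self._next_id

    def join(self, trans, branch_name):
        self.participants[trans].add(branch_name)


class Branch:
    """Hypothetical server holding accounts; joins on first use in a transaction."""

    def __init__(self, name, coordinator, balances):
        self.name = name
        self.coordinator = coordinator
        self.balances = dict(balances)  # committed balances
        self.tentative = {}             # trans -> {account: tentative balance}

    def _touch(self, trans, account):
        if trans not in self.tentative:
            self.coordinator.join(trans, self.name)  # become a participant
            self.tentative[trans] = {}
        self.tentative[trans].setdefault(account, self.balances[account])

    def withdraw(self, trans, account, amount):
        self._touch(trans, account)
        self.tentative[trans][account] -= amount

    def deposit(self, trans, account, amount):
        self._touch(trans, account)
        self.tentative[trans][account] += amount


coord = Coordinator()
branch_x = Branch("BranchX", coord, {"A": 10})          # assumed balances
branch_y = Branch("BranchY", coord, {"B": 10})
branch_z = Branch("BranchZ", coord, {"C": 0, "D": 0})

t = coord.open_transaction()
branch_x.withdraw(t, "A", 4)
branch_z.deposit(t, "C", 4)
branch_y.withdraw(t, "B", 3)
branch_z.deposit(t, "D", 3)
print(sorted(coord.participants[t]))  # ['BranchX', 'BranchY', 'BranchZ']
```

Nothing is committed here: the updates live only in each branch's tentative state until an atomic commit protocol decides the outcome.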
Atomic Commit Problem
• The atomicity principle requires that either all the distributed operations of a transaction complete, or all abort.
• At some stage, the client executes closeTransaction(). Atomicity then requires that either all participants (remember, these are on the server side) and the coordinator commit, or all abort.
• What problem statement is this?
Atomic Commit Protocols
• Consensus, but the system is asynchronous!
• So we need to ensure the safety property in real-life implementations: never have some participants agreeing to commit and others agreeing to abort.
• First cut: one-phase commit protocol. The coordinator unilaterally communicates either commit or abort to all participants (servers), repeating until all acknowledge.
• This doesn’t work when a participant crashes before receiving the message (partial transaction results that were in memory are lost).
• It does not allow a participant to abort the transaction, e.g., under error conditions.
Atomic Commit Protocols
• Consensus, but it’s impossible in asynchronous networks!
• So we need to ensure the safety property in real-life implementations: never have some participants committing while others abort. Err on the side of safety.
• Alternative: two-phase commit protocol
  – The first phase involves the coordinator collecting a vote (commit or abort) from each participant.
    – Each participant stores its partial results in permanent storage before voting.
  – Now the coordinator makes a decision:
    – If all participants want to commit and no one has crashed, the coordinator multicasts a “commit” message. Everyone commits. If a participant fails, then on recovery it can get the commit message from the coordinator.
    – Else, if any participant has crashed or aborted, the coordinator multicasts an “abort” message to all participants. Everyone aborts.
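The coordinator's decision rule is small enough to write down directly. The following is a minimal sketch; the function name `decide` and the representation of votes as a dictionary are assumptions for illustration. A participant that crashed or timed out simply has no entry in `votes`, and that is treated the same as a No.

```python
def decide(votes, expected):
    """Return "commit" only if every expected participant voted "Yes".

    votes:    dict mapping participant name -> "Yes" or "No" (replies received)
    expected: iterable of all participant names in the transaction
    """
    all_yes = all(votes.get(p) == "Yes" for p in expected)
    return "commit" if all_yes else "abort"


print(decide({"A": "Yes", "B": "Yes"}, ["A", "B"]))  # commit
print(decide({"A": "Yes", "B": "No"}, ["A", "B"]))   # abort
print(decide({"A": "Yes"}, ["A", "B"]))              # abort (B never replied)
```

Treating a missing vote as an abort is what "err on the side of safety" amounts to: the coordinator may abort a transaction that could have committed, but it never commits one that a participant could not.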
RPCs for Two-Phase Commit Protocol
canCommit?(trans) -> Yes / No
    Call from coordinator to participant to ask whether it can commit a transaction. Participant replies with its vote. (Phase 1)
doCommit(trans)
    Call from coordinator to participant to tell participant to commit its part of a transaction. (Phase 2)
doAbort(trans)
    Call from coordinator to participant to tell participant to abort its part of a transaction. (Phase 2)
getDecision(trans) -> Yes / No
    Call from participant to coordinator to ask for the decision on a transaction after it has voted Yes but has still received no reply within the timeout. Also used to recover from a server crash or delayed messages.
haveCommitted(trans, participant)
    Call from participant to coordinator to confirm that it has committed the transaction. (May not be required if getDecision() is used.)
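One way to pin these RPCs down is as typed interfaces. The sketch below uses Python's `typing.Protocol`; the snake_case method names, the `int` transaction id, and mapping Yes / No to `bool` are assumptions made for the example, and `InMemoryParticipant` is a hypothetical stand-in, not a real server.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ParticipantRPC(Protocol):
    """Calls made by the coordinator on a participant."""

    def can_commit(self, trans: int) -> bool: ...   # Phase 1: the vote
    def do_commit(self, trans: int) -> None: ...    # Phase 2
    def do_abort(self, trans: int) -> None: ...     # Phase 2


@runtime_checkable
class CoordinatorRPC(Protocol):
    """Calls made by a participant on the coordinator."""

    def get_decision(self, trans: int) -> bool: ...
    def have_committed(self, trans: int, participant: str) -> None: ...


class InMemoryParticipant:
    """Hypothetical participant that always votes Yes."""

    def __init__(self):
        self.state = {}

    def can_commit(self, trans):
        self.state[trans] = "prepared"
        return True

    def do_commit(self, trans):
        self.state[trans] = "committed"

    def do_abort(self, trans):
        self.state[trans] = "aborted"


p = InMemoryParticipant()
print(isinstance(p, ParticipantRPC))   # True: all three methods are present
print(isinstance(p, CoordinatorRPC))   # False
```

The runtime check only verifies that the methods exist, not their signatures, so it is a convenience rather than a guarantee.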
The two-phase commit protocol
Phase 1 (voting phase):
1. The coordinator sends a canCommit? request to each of the participants in the transaction.
2. When a participant receives a canCommit? request, it replies with its vote (Yes or No) to the coordinator. Before voting Yes, it “prepares to commit” by saving objects in permanent storage (recall that a server may crash). If its vote is No, the participant aborts immediately.
Phase 2 (completion according to outcome of vote):
3. The coordinator collects the votes (including its own), makes a decision, and logs this on disk.
   (a) If there are no failures and all the votes are Yes, the coordinator decides to commit the transaction and sends a doCommit request to each of the participants.
   (b) Otherwise, the coordinator decides to abort the transaction and sends doAbort requests to all participants that voted Yes. This is the step erring on the side of safety.
4. Participants that voted Yes are waiting for a doCommit or doAbort request from the coordinator. When a participant receives one of these messages, it acts accordingly; when it has committed, it makes a haveCommitted call.
   • If it times out waiting for a doCommit/doAbort, the participant keeps sending a getDecision to the coordinator until it learns the decision.
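The four steps can be traced in a failure-free, single-process simulation. This is a sketch under simplifying assumptions: RPCs are plain method calls, Python lists stand in for permanent storage and the coordinator's disk log, and the `Participant` class and `two_phase_commit` function are hypothetical names. Crashes, timeouts, and the haveCommitted call are left out.

```python
class Participant:
    def __init__(self, name, vote_yes=True):
        self.name = name
        self.vote_yes = vote_yes
        self.log = []          # stands in for permanent storage
        self.state = "active"

    def can_commit(self, trans):
        if self.vote_yes:
            self.log.append(("prepared", trans))  # save before voting Yes
            self.state = "prepared"
            return True
        self.state = "aborted"  # a No voter aborts immediately
        return False

    def do_commit(self, trans):
        self.state = "committed"

    def do_abort(self, trans):
        self.state = "aborted"


def two_phase_commit(trans, participants, coordinator_log):
    # Phase 1: send canCommit? to everyone and collect the votes.
    votes = {p.name: p.can_commit(trans) for p in participants}

    # Phase 2: decide, log the decision, then notify.
    decision = "commit" if all(votes.values()) else "abort"
    coordinator_log.append((decision, trans))
    for p in participants:
        if decision == "commit":
            p.do_commit(trans)
        elif votes[p.name]:     # only Yes voters need a doAbort
            p.do_abort(trans)
    return decision


branches = [Participant("BranchX"), Participant("BranchY", vote_yes=False)]
log = []
print(two_phase_commit(1, branches, log))  # abort
print([p.state for p in branches])         # ['aborted', 'aborted']
```

Note that the decision is logged before any doCommit or doAbort is sent, matching step 3; a coordinator that recovers from a crash can then answer getDecision from its log.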
Communication in Two-Phase Commit
Step  Coordinator status                       Message             Participant status
1     prepared to commit (waiting for votes)   canCommit? ->
2                                              <- Yes              prepared to commit (uncertain)
3     committed                                doCommit ->
4                                              <- haveCommitted    committed
      done
• To deal with participant crashes:
  – Each participant saves tentative updates into permanent storage right before replying Yes/No in the first phase. They are retrievable after crash recovery.
  – The coordinator logs votes and decisions too.
• To deal with canCommit? loss:
  – The participant may decide to abort unilaterally after a timeout in the first phase (the participant eventually votes No, and so the coordinator will also eventually abort).
• To deal with Yes/No loss:
  – The coordinator aborts the transaction after a timeout (pessimistic!). It must announce doAbort to those who sent in their votes.
• To deal with doCommit loss:
  – The participant may wait for a timeout, then send a getDecision request (retrying until a reply is received). It cannot unilaterally abort or commit after having voted Yes but before receiving doCommit/doAbort!
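The last rule, that a participant in the uncertain state may only keep asking, can be sketched as a retry loop. The names `await_decision` and `FlakyCoordinator` are hypothetical, a lost reply is modelled by raising Python's built-in `TimeoutError`, and the bounded `max_attempts` exists only so the example terminates; the slide's participant retries indefinitely.

```python
def await_decision(get_decision, trans, max_attempts):
    """Poll the coordinator; never decide unilaterally while uncertain.

    Returns (decision, attempts). decision is None if every attempt timed out,
    meaning the participant is still uncertain and must keep its prepared state.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return get_decision(trans), attempt
        except TimeoutError:
            continue  # a real participant would wait before retrying
    return None, max_attempts


class FlakyCoordinator:
    """Hypothetical coordinator whose first few replies are lost."""

    def __init__(self, decisions, failures):
        self.decisions = decisions  # trans -> True (commit) / False (abort)
        self.failures = failures    # number of calls that will time out

    def get_decision(self, trans):
        if self.failures > 0:
            self.failures -= 1
            raise TimeoutError("no reply from coordinator")
        return self.decisions[trans]


coordinator = FlakyCoordinator({7: True}, failures=2)
outcome, attempts = await_decision(coordinator.get_decision, 7, max_attempts=5)
print(outcome, attempts)  # True 3
```

Returning `None` rather than guessing commit or abort reflects the blocking nature of the protocol: after voting Yes, the participant cannot resolve the transaction without hearing from the coordinator.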
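Two Phase Commit (2PC) Protocol
[Figure: state diagrams for the coordinator and a participant, summarized below.]
Coordinator (on CloseTrans()):
• Execute -> Uncertain: precommit
  – Send request to each participant
  – Wait for replies (time out possible)
• Uncertain -> Commit: all YES (COMMIT decision)
  – Send COMMIT to each participant
• Uncertain -> Abort: timeout or a NO (ABORT decision)
  – Send ABORT to each participant
Participant (on request from the coordinator):
• Execute -> Uncertain: ready
  – Precommit
  – Send YES to coordinator
  – Wait for decision
• Execute -> Abort: not ready
  – Send NO to coordinator
• Uncertain -> Commit: COMMIT decision
  – Make transaction visible
• Uncertain -> Abort: ABORT decision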
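The two state diagrams can be encoded as transition tables. In this sketch the state names follow the slide (Execute, Uncertain, Commit, Abort), while the event names such as `close_trans` or `request_ready` are assumptions chosen for the example.

```python
# (current state, event) -> next state
COORDINATOR = {
    ("execute", "close_trans"): "uncertain",    # send request, wait for replies
    ("uncertain", "all_yes"): "commit",         # send COMMIT to each participant
    ("uncertain", "timeout_or_no"): "abort",    # send ABORT to each participant
}

PARTICIPANT = {
    ("execute", "request_ready"): "uncertain",  # precommit, send YES
    ("execute", "request_not_ready"): "abort",  # send NO
    ("uncertain", "commit_decision"): "commit", # make transaction visible
    ("uncertain", "abort_decision"): "abort",
}


def run(table, events, state="execute"):
    """Apply events in order; a KeyError signals an illegal transition."""
    for event in events:
        state = table[(state, event)]
    return state


print(run(COORDINATOR, ["close_trans", "all_yes"]))          # commit
print(run(PARTICIPANT, ["request_ready", "abort_decision"]))  # abort
```

Neither table has a transition out of Commit or Abort, and the participant has no timeout edge out of Uncertain, which is where the protocol can block.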