Distributed Databases
Distributed Transactions: Commit Protocols

Course outline
Distributed Transactions
- Overview
- System Failure Modes
Commit Protocols
- Two Phase Commit Protocol (2PC)
  - Phase 1: Obtaining a Decision
  - Phase 2: Recording the Decision
- Handling of Failures
  - Site Failure, Coordinator Failure
  - Network Partition
- Recovery and Concurrency Control
Alternative Models
- Persistent messaging systems
- Error Conditions with Persistent Messaging
- Persistent Messaging and Workflows
- Implementation of Persistent Messaging
Annex
- Three Phase Commit (3PC), Transactional Workflows
Distributed Transactions: Overview (1/2)
In distributed database systems, a transaction may access data at several sites.
Concepts: local and global transactions
- A local transaction accesses data only at the single site at which the transaction was initiated.
- A global transaction either accesses data at a site different from the one at which it was initiated, or accesses data at several different sites.
To deal with such distributed transactions, each site should have:
- a local transaction manager (TM)
- a transaction coordinator (TC)
Distributed Transactions: Overview (2/2)
Transaction System Architecture
[Figure: sites S1, ..., Si, ..., Sn, each with a transaction coordinator (TC) and a transaction manager (TM)]
Each site has:
- a local transaction manager (TM), responsible for:
  - maintaining a log for recovery purposes
  - participating in coordinating the concurrent execution of the transactions executing at that site
- a transaction coordinator (TC), responsible for:
  - starting the execution of transactions that originate at the site
  - distributing subtransactions to appropriate sites for execution
  - coordinating the termination of each transaction that originates at the site
Distributed Transactions: System Failure Modes
Failures unique to distributed systems:
- Failure of a site.
- Loss of messages: handled by network transmission control protocols such as TCP/IP.
- Failure of a communication link: handled by network protocols, by routing messages via alternative links.
- Network partition: a network is said to be partitioned when it has been split into two or more subsystems that lack any connection between them.
  - Note: a subsystem may consist of a single node.
- Network partitioning and site failures are generally indistinguishable.
Distributed Transactions: Commit Protocols
Commit protocols are used to ensure atomicity across sites.
- A transaction that executes at multiple sites must either be committed at all the sites or aborted at all the sites.
  - It is not acceptable to have a transaction committed at one site and aborted at another.
- A log is maintained at each site; in addition to the kinds of information maintained in a centralized DBMS, actions taken as part of the commit protocol are also logged.
- The two-phase commit (2PC) protocol is the most widely used.
- The three-phase commit (3PC) protocol is more complex, but avoids the blocking problem of 2PC under certain assumptions (see annex).
Distributed Transactions: Commit Protocols: Two Phase Commit Protocol (2PC)
Assumes a fail-stop model:
- failed sites simply stop working,
- and do not cause any other harm, such as sending incorrect messages to other sites.
Execution of the protocol is initiated by the coordinator (TC) after the last step of the transaction has been reached.
- The TC at that site has broken the transaction up into a collection of sub-transactions that execute at different sites.
- The protocol involves all the sites at which the transaction executed.
Distributed Transactions: Commit Protocols: 2PC
Phase 1: Obtaining a Decision (voting phase)
Let T be a transaction initiated at site Si, with Si.C the coordinator.
Si.C asks all participants to prepare to commit transaction T:
- adds the record <prepare T> to the log,
- forces the log to stable storage,
- sends prepare T messages to all sites at which T executed.
Upon receiving the message, the TM at each remote site Sj determines whether it can commit the transaction (decides whether to abort or commit its sub-transaction):
- if not, it adds a record <no T> to the log and sends an abort T message to Si.C;
- if the transaction can be committed, it adds a record <ready T> to the log, forces the log to stable storage, and sends a ready T message to Si.C.
[Figure: Si sends "prepare T" to each Sj; each Sj replies with "ready T" or "abort T"]
Distributed Transactions: Commit Protocols: 2PC
Phase 2: Recording the Decision (termination phase)
T can be committed if Si.C received a ready T message from all the participating sites. Otherwise T must be aborted (Si.C receives even one no (abort T) message, or does not receive a response from some participating site within a specified time-out interval).
Coordinator:
- adds a decision record, <commit T> or <abort T>, to the log,
- forces the record onto stable storage; once the record is on stable storage it is irrevocable (even if failures occur),
- sends a message to each participant informing it of the decision (commit or abort).
Participants take appropriate action locally.
- When a participating site receives an abort message, it force-writes an abort log record, sends an ack message to the coordinator Si.C, and aborts the sub-transaction.
- When a participating site receives a commit message, it force-writes a commit log record, sends an ack message to the coordinator Si.C, and commits the sub-transaction.
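The two phases described above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the `Participant` class, its `can_commit` and `reachable` flags, and the use of Python lists in place of stable-storage logs are all hypothetical names introduced here, and an unreachable site is modelled as a time-out by returning `None`.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """A participating site; `can_commit` and `reachable` are illustrative knobs."""
    name: str
    can_commit: bool = True
    reachable: bool = True
    log: list = field(default_factory=list)  # stands in for stable storage

    def on_prepare(self, t):
        # Phase 1 at the participant: log the vote before replying.
        if not self.reachable:
            return None  # the coordinator sees this as a time-out
        if self.can_commit:
            self.log.append(f"<ready {t}>")
            return "ready"
        self.log.append(f"<no {t}>")
        return "abort"

    def on_decision(self, t, decision):
        # Phase 2 at the participant: record the decision, then acknowledge.
        if not self.reachable:
            return None
        self.log.append(f"<{decision} {t}>")
        return "ack"


def two_phase_commit(t, coordinator_log, participants):
    # Phase 1: log <prepare T>, then collect a vote from every participant.
    coordinator_log.append(f"<prepare {t}>")
    votes = [p.on_prepare(t) for p in participants]

    # Phase 2: commit only if every vote is "ready"; a "no" or a time-out aborts.
    decision = "commit" if all(v == "ready" for v in votes) else "abort"
    coordinator_log.append(f"<{decision} {t}>")  # irrevocable once logged
    acks = [p.on_decision(t, decision) for p in participants]
    return decision, acks


sites = [Participant("S1"), Participant("S2", can_commit=False)]
print(two_phase_commit("T", [], sites))  # ('abort', ['ack', 'ack'])
```

A real coordinator would also keep resending the decision to sites whose ack is missing; that loop is left out to keep the sketch short.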
Distributed Transactions: 2PC Protocol: Handling of Failures: Site Failure
When a participating site Sk recovers, it examines its log to determine the fate of transactions active at the time of the failure.
- Log contains a <commit T> record: the site executes redo(T).
- Log contains an <abort T> record: the site executes undo(T).
- Log contains a <ready T> record: the site must repeatedly contact the coordinator Si.C to determine the fate of T. Once Si.C responds with either commit or abort:
  - if T committed, redo(T);
  - if T aborted, undo(T).
- Log contains no control records concerning T: this implies that Sk failed before responding to the prepare T message from Si.C. Since Sk never sent ready T, the coordinator cannot have decided to commit, so Sk executes undo(T).
The coordinator must periodically resend the commit or abort message to each participating site until it receives an ack.
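The log inspection performed by a recovering participant can be written as a simple lookup. This is a minimal sketch, assuming the log is a list of record strings such as `"<ready T>"`; the function name `recovery_action` and the returned labels are made up for illustration. The final case (no control records) returns "undo" on the assumption that the coordinator, having received no ready T from this site, aborted T under the Phase 2 time-out rule.

```python
def recovery_action(log, t):
    """Return the action a recovering participant takes for transaction t."""
    if f"<commit {t}>" in log:
        return "redo"
    if f"<abort {t}>" in log:
        return "undo"
    if f"<ready {t}>" in log:
        # In doubt: the site voted ready but does not know the decision.
        return "ask coordinator"
    # No <ready T> was ever written (the log is empty or holds only <no T>),
    # so the coordinator cannot have committed T.
    return "undo"


print(recovery_action(["<ready T>"], "T"))  # ask coordinator
```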
Distributed Transactions: 2PC Protocol: Handling of Failures: Coordinator Failure
If the coordinator fails while the commit protocol for T is executing, the participating sites must decide on T's fate:
1. If an active site contains
   - a <commit T> record in its log, then T must be committed;
   - an <abort T> record in its log, then T must be aborted.
2. If some active participating site does not contain a <ready T> record in its log, then the failed coordinator cannot have decided to commit T; the sites can therefore abort T.
3. If none of the above cases holds, then all active sites must have a <ready T> record in their logs, but no additional control records (such as <abort T> or <commit T>).
   - In this case the active sites must wait for the coordinator Si.C to recover in order to learn the decision. This is the blocking problem of 2PC.
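The three cases above amount to a decision procedure over the logs of the sites that are still active. The sketch below assumes each log is a list of record strings; `resolve_after_coordinator_failure` is a hypothetical name, and "wait" stands for the case in which the sites must block until the coordinator recovers.

```python
def resolve_after_coordinator_failure(active_logs, t):
    """Decide the fate of t from the logs of the active participating sites."""
    # Case 1: some active site already knows the decision.
    if any(f"<commit {t}>" in log for log in active_logs):
        return "commit"
    if any(f"<abort {t}>" in log for log in active_logs):
        return "abort"
    # Case 2: some active site never voted ready, so commit was impossible.
    if any(f"<ready {t}>" not in log for log in active_logs):
        return "abort"
    # Case 3: every active site is ready but undecided.
    return "wait"


print(resolve_after_coordinator_failure([["<ready T>"], ["<ready T>"]], "T"))  # wait
```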
Distributed Transactions: 2PC Protocol: Handling of Failures: Network Partition
- If the coordinator and all its participants remain in one partition, the failure has no effect on the commit protocol.
- If the coordinator and its participants belong to several partitions:
  - Sites that are not in the partition containing the coordinator think the coordinator has failed, and execute the protocol to deal with failure of the coordinator.
  - The coordinator and the sites that are in the same partition as the coordinator think that the sites in the other partitions have failed, and follow the usual commit protocol.
Distributed Transactions: 2PC Protocol: Recovery and Concurrency Control
In-doubt (undecided) transactions have a <ready T> log record, but neither a <commit T> nor an <abort T> log record.
- The recovering site must determine the commit/abort status of such transactions by contacting other sites; this can slow and potentially block recovery.
- Recovery algorithms can note lock information in the log:
  - Instead of <ready T>, write out <ready T, L>, where L is the list of locks held by T when the log record is written (read locks can be omitted).
  - For every in-doubt transaction T, all the locks noted in the <ready T, L> log record are reacquired.
- After lock reacquisition, transaction processing can resume while the in-doubt transactions are resolved.
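Lock reacquisition from <ready T, L> records can be illustrated as follows. This is a sketch under assumptions: log records are modelled as Python tuples, `("ready", T, [locks])`, `("commit", T)` or `("abort", T)`, and the lock table is a plain dictionary from data item to owning transaction. None of these representations comes from the slides.

```python
def reacquire_in_doubt_locks(log):
    """Return {data_item: transaction} for the locks of in-doubt transactions."""
    # Transactions with a commit or abort record are decided, not in doubt.
    decided = {rec[1] for rec in log if rec[0] in ("commit", "abort")}
    lock_table = {}
    for rec in log:
        if rec[0] == "ready" and rec[1] not in decided:
            for item in rec[2]:
                lock_table[item] = rec[1]  # reacquire the lock on behalf of T
    return lock_table


log = [("ready", "T1", ["x", "y"]), ("ready", "T2", ["z"]), ("commit", "T2")]
print(reacquire_in_doubt_locks(log))  # {'x': 'T1', 'y': 'T1'}
```

Once the table is rebuilt, new transactions can run against every item not listed in it, which is the point of logging L.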
Distributed Transactions: Transaction Processing: Alternative Models (1/2)
The notion of a single transaction spanning multiple sites is inappropriate for many applications.
- E.g. a transaction crossing an organizational boundary.
- No organization would like to permit an externally initiated transaction to block local transactions for an indeterminate period.
Alternative models carry out transactions by sending messages.
- Code to handle messages must be carefully designed to ensure atomicity and durability properties for updates.
- Isolation cannot be guaranteed, in that intermediate stages are visible.
Distributed Transactions: Alternative Models (2/2): Messaging
Motivating example: funds transfer between two banks
- Two-phase commit would have the potential to block updates on the accounts involved in the funds transfer.
- Alternative solution:
  - Debit money from the source account and send a message to the other site.
  - The site receives the message and credits the destination account.
Messaging has long been used for distributed transactions (even before computers were invented!).
Atomicity issue
- Once the transaction sending a message is committed, the message must be guaranteed to be delivered (the guarantee holds as long as the destination site is up and reachable).
Distributed Transactions: Alternative Models / Messaging: Error Conditions with Persistent Messaging
- Code to handle messages has to take care of a variety of failure situations.
  - E.g. if the destination account does not exist, a failure message must be sent back to the source site.
  - When a failure message is received from the destination site, or the destination site itself does not exist, the money must be deposited back in the source account.
  - This is a problem if the source account has been closed (get humans to take care of the problem).
- User code executing transaction processing using 2PC does not have to deal with such failures.
- There are many situations where the extra effort of error handling is worth the benefit of absence of blocking.
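The failure cases listed above can be made concrete with a small funds-transfer sketch. Everything here is illustrative: accounts are dictionaries from account id to balance, messages are dictionaries with hypothetical fields (`id`, `from`, `to`, `amount`), and `manual_queue` stands for whatever mechanism hands a case over to a human.

```python
def handle_at_destination(accounts, msg):
    """Credit the destination account, or build a failure reply for the source."""
    if msg["to"] not in accounts:
        # Destination account does not exist: report back to the source site.
        return {"type": "failure", "id": msg["id"],
                "from": msg["from"], "amount": msg["amount"]}
    accounts[msg["to"]] += msg["amount"]
    return {"type": "ok", "id": msg["id"]}


def handle_reply_at_source(accounts, reply, manual_queue):
    """Compensate a failed transfer by depositing the money back."""
    if reply["type"] != "failure":
        return
    if reply["from"] in accounts:
        accounts[reply["from"]] += reply["amount"]
    else:
        # Source account was closed in the meantime: a human must resolve it.
        manual_queue.append(reply)


source = {"A": 50}       # 100 was already debited from A by the sending transaction
destination = {"B": 10}
reply = handle_at_destination(destination, {"id": 1, "from": "A", "to": "X", "amount": 100})
humans = []
handle_reply_at_source(source, reply, humans)
print(source, humans)  # {'A': 150} []
```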
Distributed Transactions: Alternative Models / Messaging: Persistent Messaging and Workflows
Workflows provide a general model of transactional processing involving multiple sites (the coordinated execution of multiple tasks performed by different processing entities) and possibly human processing of certain steps (see annex).
- E.g. when a bank receives a loan application, it may need to
  - contact external credit-checking agencies,
  - get approvals of one or more managers,
  and then respond to the loan application.
- Persistent messaging forms the underlying infrastructure for workflows in a distributed environment.
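The loan example carries an ordering constraint: the response can only be sent after the credit check and the approvals. One way to express such intertask dependencies, assuming Python 3.9 or later, is the standard library's `graphlib.TopologicalSorter`. The task names below are hypothetical labels for the steps of the example, and this is a sketch of dependency ordering only, not of a full workflow system.

```python
from graphlib import TopologicalSorter


def schedule(dependencies):
    """Return the tasks in an order that respects the given dependencies.

    `dependencies` maps each task to the set of tasks that must finish first.
    """
    return list(TopologicalSorter(dependencies).static_order())


loan_workflow = {
    "respond_to_applicant": {"credit_check", "manager_approval"},
}
order = schedule(loan_workflow)
print(order[-1])  # respond_to_applicant
```

The relative order of the two prerequisite tasks is not fixed by the dependencies, so only the last position is shown.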
Distributed Transactions: Alternative Models / Messaging: Implementation of Persistent Messaging
Sending site protocol
- The sending transaction writes the message to a special relation (table), messages-to-send. The message is also given a unique identifier.
- A message delivery process monitors the messages-to-send relation.
  - When a new message is found, it is sent to its destination.
  - When an acknowledgment is received from the destination, the message is deleted from messages-to-send.
  - If no acknowledgment is received after a timeout period, the message is resent.
Receiving site protocol
- When a message is received:
  - it is written to a received-messages relation if it is not already present (the message id is used for this check); the transaction performing the write is committed;
  - an acknowledgment (with the message id) is then sent to the sending site.
- There may be very long delays in message delivery, coupled with repeated messages.
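The sending and receiving protocols above can be sketched with two small classes. This is a minimal in-memory illustration: dictionaries stand in for the messages-to-send and received-messages relations, the `network_ok` callback is a hypothetical way to simulate a lost message or a missing acknowledgment, and no real database transactions or timers are involved.

```python
import itertools


class Receiver:
    def __init__(self):
        self.received_messages = {}  # stands in for the received-messages relation
        self.applied = []            # payloads actually processed, once each

    def receive(self, msg_id, payload):
        # The message id makes delivery idempotent: duplicates are not re-applied.
        if msg_id not in self.received_messages:
            self.received_messages[msg_id] = payload
            self.applied.append(payload)
        return msg_id  # acknowledgment carrying the message id


class Sender:
    def __init__(self):
        self.messages_to_send = {}   # stands in for the messages-to-send relation
        self._ids = itertools.count(1)

    def enqueue(self, payload):
        # Done by the sending transaction: store the message under a unique id.
        msg_id = next(self._ids)
        self.messages_to_send[msg_id] = payload
        return msg_id

    def deliver_pending(self, receiver, network_ok=lambda msg_id: True):
        # One pass of the delivery process over all unacknowledged messages.
        for msg_id, payload in list(self.messages_to_send.items()):
            if not network_ok(msg_id):
                continue  # no ack: the message stays and is resent on a later pass
            if receiver.receive(msg_id, payload) == msg_id:
                del self.messages_to_send[msg_id]


sender, receiver = Sender(), Receiver()
m = sender.enqueue("credit 100 to account B")
sender.deliver_pending(receiver, network_ok=lambda i: False)  # first attempt lost
sender.deliver_pending(receiver)                              # resend succeeds
receiver.receive(m, "credit 100 to account B")                # late duplicate
print(receiver.applied)  # ['credit 100 to account B']
```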
Annex: A Complement
- Three Phase Commit (3PC)
- Transactional Workflows
  - Workflows are activities that involve the coordinated execution of multiple tasks performed by different processing entities.
  - With the growth of networks, and the existence of multiple autonomous database systems, workflows provide a convenient way of carrying out tasks that involve multiple systems.
Three Phase Commit (3PC)
Assumptions:
- No network partitioning.
- At any point, at least one site must be up.
- At most K sites (participants as well as coordinator) can fail.
Phase 1: Obtaining a Preliminary Decision: identical to 2PC Phase 1.
- Every site is ready to commit if instructed to do so.
Phase 2 of 2PC is split into two phases, Phase 2 and Phase 3 of 3PC.
- In Phase 2, the coordinator makes a decision as in 2PC (called the pre-commit decision) and records it in multiple (at least K) sites.
- In Phase 3, the coordinator sends the commit/abort message to all participating sites.
Under 3PC, knowledge of the pre-commit decision can be used to commit despite coordinator failure.
- Avoids the blocking problem as long as fewer than K sites fail.
Drawbacks:
- higher overheads;
- the assumptions may not be satisfied in practice.
Transactional Workflows
Example of a workflow: delivery of an email message, which goes through several mail systems to reach its destination.
- Each mailer performs a task: forwarding the mail to the next mailer.
- If a mailer cannot deliver the mail, the failure must be handled semantically (delivery failure message).
Workflows usually involve humans, e.g. loan processing or purchase order processing.
Loan Processing Workflow
- In the past, workflows were handled by creating and forwarding paper forms.
- Computerized workflows aim to automate many of the tasks, but humans still play a role, e.g. in approving loans.
Transactional Workflows
Extend transaction concepts to the context of workflows. The following issues must be addressed to computerize a workflow:
- Specification (static/dynamic) of workflows: detailing the tasks that must be carried out and defining the execution requirements (dependencies, preconditions).
- Execution of workflows: executing the transactions specified in the workflow while also providing traditional database safeguards (ACID transactional requirements).
Workflow management systems include:
- Scheduler: a program that processes workflows, monitoring various events and evaluating conditions related to intertask dependencies.
- Task agents: control the execution of a task by a processing entity.
- A mechanism to query the state of the workflow system.
  - State of a workflow: consists of the collection of states of its constituent tasks.
Failure-Atomicity Requirements