TRANSACTIONS (AND EVENT-DRIVEN PROGRAMMING)
EE 324
Concurrency Control

Concurrency Control
[Figure: General organization of managers for handling]

Two-phase locking is “pessimistic”
- Acts to prevent non-serializable schedules from arising: pessimistically assumes conflicts are fairly likely.
- Can deadlock, e.g. T1 reads x then writes y; T2 reads y then writes x. This doesn’t always deadlock, but it is capable of deadlocking.
  - Overcome by aborting if we wait for too long,
  - or by designing transactions to obtain locks in a known and agreed-upon order.
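
The agreed-upon lock ordering remedy can be sketched in Python. This is a minimal illustration, not part of the lecture: the `Account` class, the `transfer` helper, and the choice of ordering locks by account name are all assumptions made for the example.

```python
import threading


class Account:
    """Hypothetical lockable object; not from the slides."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    # Always lock the account with the smaller name first, so two
    # transfers in opposite directions cannot wait on each other in a cycle.
    first, second = sorted((src, dst), key=lambda acct: acct.name)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount


def run_demo(rounds=1000):
    x, y = Account("x", 100), Account("y", 100)
    # One thread moves money x -> y, the other y -> x: the access pattern
    # that could deadlock if each thread locked its source account first.
    t1 = threading.Thread(target=lambda: [transfer(x, y, 1) for _ in range(rounds)])
    t2 = threading.Thread(target=lambda: [transfer(y, x, 1) for _ in range(rounds)])
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return x.balance, y.balance
```

Because both threads request the lock on `x` before the lock on `y`, neither can hold one lock while waiting for a lock the other already holds.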

Transactions T and U with exclusive locks
Transaction T:                           Transaction U:
  balance = b.getBalance()                 balance = b.getBalance()
  b.setBalance(bal*1.1)                    b.setBalance(bal*1.1)
  a.withdraw(bal/10)                       c.withdraw(bal/10)
T operations             T locks         U operations             U locks
openTransaction
bal = b.getBalance()     lock B
b.setBalance(bal*1.1)                    openTransaction
a.withdraw(bal/10)       lock A          bal = b.getBalance()     waits for T’s lock on B
closeTransaction         unlock A, B
                                                                  lock B
                                         b.setBalance(bal*1.1)
                                         c.withdraw(bal/10)       lock C
                                         closeTransaction         unlock B, C

Schemes for Concurrency control
- Locking
- Optimistic concurrency control (CDK5 16.5)
- Time-stamp based concurrency control (not going to cover)

Optimistic Concurrency Control
- Drawbacks of locking
  - Overhead of lock maintenance
  - Deadlocks
  - Reduced concurrency
- Optimistic Concurrency Control
  - In most applications, the likelihood of conflicting accesses by concurrent transactions is low
  - Transactions proceed as though there are no conflicts

Optimistic Concurrency Control
Three phases:
- Working Phase: transactions read and write private copies of objects (copied from the most recently committed versions).
- Validation Phase: once the transaction is done, it is validated to establish whether or not its operations on objects conflict with operations of other transactions on the same objects. If there is no conflict, it can commit; otherwise some form of conflict resolution is needed and the transaction may abort.
- Update Phase: if the transaction commits, its private copies are used to make the changes permanent.

Validation of transactions
[Figure: timeline of the Working, Validation, and Update phases for the earlier committed transactions T1, T2, and T3, the transaction Tv being validated, and the later active transactions active1 and active2.]

Validation uses read/write conflict rules
Tv (transaction being validated)   Ti (earlier committed transactions)   Rule
Write                              Read                                  1. Ti must not read objects written by Tv
Read                               Write                                 2. Tv must not read objects written by Ti
Write                              Write                                 3. Ti must not write objects written by Tv, and Tv must not write objects written by Ti

Optimistic concurrency control
- Transactions are numbered at the validation phase.
- Validation and update occur inside a critical section. (Satisfies rule 3)
- Backward validation
  - Rule 1 is satisfied because all read operations of earlier overlapping transactions were performed before the validation of Tv started; they cannot be affected by Tv’s writes.
  - The read set of Tv must be compared with the write sets of T2 and T3. (Rule 2)
  - In other words, the read set of the transaction being validated is compared with the write sets of other overlapping transactions that have already committed.
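
Backward validation as described above can be sketched in a few lines of Python. This is an illustrative sketch under assumptions of my own: the `Txn` record, the `start_tn` field (the number of the last transaction committed when Tv started), and the `committed` list of `(transaction number, write set)` pairs are hypothetical bookkeeping, not something the slides specify.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    """Hypothetical record of a transaction in its working phase."""
    name: str
    start_tn: int  # number of the last transaction committed when this one started
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


def backward_validate(tv, committed):
    """Return True if tv passes backward validation (rule 2).

    committed is a list of (transaction_number, write_set) pairs for
    transactions that have already committed.
    """
    for tn, write_set in committed:
        overlapped = tn > tv.start_tn  # committed after tv began working
        if overlapped and tv.read_set & write_set:
            return False  # tv read an object that an overlapping transaction wrote
    return True
```

For example, with `committed = [(1, {"a"}), (2, {"x"}), (3, {"y"})]`, a transaction with `start_tn=1` that read `x` fails validation because T2 overlapped it and wrote `x`, while one that read only `a` passes because T1 committed before it started.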

Today's Lecture
- Distributed transactions

Distributed Transactions
- Motivation
  - Provide distributed atomic operations at multiple servers that maintain shared data for clients
  - Provide recoverability from server crashes
- Properties
  - Atomicity, Consistency, Isolation, Durability (ACID)
- Concepts: commit, abort, distributed commit

Transactions in distributed systems
- The main issue that arises is that now we can have multiple database servers that are touched by one transaction.
- Reasons?
  - Data spread around: each server owns a subset
  - Could have replicated some data object on multiple servers, e.g. to load-balance read access for a large client set
  - Might do this for high availability

Atomic Commit Protocols
- The atomicity of a transaction requires that when a distributed transaction comes to an end, either all of its operations are carried out or none of them.
- One-phase commit
  - The coordinator tells all participants (servers) to commit.
  - It keeps repeating the request until all participants reply.
  - If a participant cannot commit (say, because of concurrency control), it has no way to inform the coordinator.
  - Also, there is no way for the coordinator to abort.

The two-phase commit protocol (2PC)
- Designed to allow any participant to abort its part of a transaction.
- But this means the whole transaction must be aborted.
  - Why?
- First phase: all participants vote (abort or commit). A participant that votes commit makes all its changes permanent (durability), goes to the prepared state, and logs this fact.
  - Such a participant will eventually commit (if the coordinator says so) even if it crashes.
- Second phase: joint decision.

The two-phase commit protocol - 1
Phase 1 (voting phase):
1. The coordinator sends a canCommit? (VOTE_REQUEST) request to each of the participants in the transaction.
2. When a participant receives a canCommit? request, it replies with its vote, Yes (VOTE_COMMIT) or No (VOTE_ABORT), to the coordinator. Before voting Yes, it prepares to commit by saving objects in permanent storage. If the vote is No, the participant aborts immediately.

The two-phase commit protocol - 2
Phase 2 (completion according to outcome of vote):
3. The coordinator collects the votes (including its own).
   (a) If there are no failures and all the votes are Yes, the coordinator decides to commit the transaction and sends a doCommit (GLOBAL_COMMIT) request to each of the participants.
   (b) Otherwise the coordinator decides to abort the transaction and sends doAbort (GLOBAL_ABORT) requests to all participants that voted Yes.
4. Participants that voted Yes are waiting for a doCommit or doAbort request from the coordinator. When a participant receives one of these messages, it acts accordingly and, in the case of commit, makes a haveCommitted call as confirmation to the coordinator.

Communication in two-phase commit protocol
Step  Coordinator status                       Message             Participant status
1     prepared to commit (waiting for votes)   canCommit? ->
2                                              <- Yes              prepared to commit (uncertain)
3     committed                                doCommit ->
4                                              <- haveCommitted    committed
      done

Commit protocol illustrated
[Figure, two slides: the coordinator asks the participants “ok to commit?”; they answer “ok with us”; the coordinator then sends “commit”.]

Operations for two-phase commit protocol
- canCommit?(trans) -> Yes / No
  Call from coordinator to participant to ask whether it can commit a transaction. Participant replies with its vote.
- doCommit(trans)
  Call from coordinator to participant to tell participant to commit its part of a transaction.
- doAbort(trans)
  Call from coordinator to participant to tell participant to abort its part of a transaction.
- haveCommitted(trans, participant)
  Call from participant to coordinator to confirm that it has committed the transaction.
- getDecision(trans) -> Yes / No
  Call from participant to coordinator to ask for the decision on a transaction after it has voted Yes but has still had no reply after some delay. Used to recover from server crash or delayed messages.

Two-Phase Commit protocol - 3 (TV sec 8.5)
Actions by coordinator:
    write START_2PC to local log;
    multicast VOTE_REQUEST to all participants;
    while not all votes have been collected {
        wait for any incoming vote;
        if timeout {
            write GLOBAL_ABORT to local log;
            multicast GLOBAL_ABORT to all participants;
            exit;
        }
        record vote;
    }
    if all participants sent VOTE_COMMIT and coordinator votes COMMIT {
        write GLOBAL_COMMIT to local log;
        multicast GLOBAL_COMMIT to all participants;
    } else {
        write GLOBAL_ABORT to local log;
        multicast GLOBAL_ABORT to all participants;
    }

Two-Phase Commit protocol - 4
Actions by participant:
    write INIT to local log;
    wait for VOTE_REQUEST from coordinator;
    if participant votes COMMIT {
        write VOTE_COMMIT to local log;
        send VOTE_COMMIT to coordinator;
        wait for DECISION from coordinator;
        if timeout {
            multicast DECISION_REQUEST to other participants;
            wait until DECISION is received; /* remain blocked */
            write DECISION to local log;
        }
        if DECISION == GLOBAL_COMMIT
            write GLOBAL_COMMIT to local log;
        else if DECISION == GLOBAL_ABORT
            write GLOBAL_ABORT to local log;
    } else {
        write VOTE_ABORT to local log;
        send VOTE_ABORT to coordinator;
    }
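
The coordinator and participant actions can be exercised together in a small single-process Python simulation. This is only a sketch under stated assumptions: the `Participant` class, its `will_commit` and `reachable` flags, and the `run_2pc` function are hypothetical names, logs are plain lists rather than persistent storage, and an unreachable participant stands in for a vote timeout. Unlike the coordinator pseudocode, the sketch asks every participant before deciding instead of exiting at the first timeout; the decision is the same.

```python
class Participant:
    """Hypothetical in-memory participant; its log is a plain list."""

    def __init__(self, name, will_commit=True, reachable=True):
        self.name = name
        self.will_commit = will_commit
        self.reachable = reachable  # False models a crash before voting
        self.log = ["INIT"]

    def on_vote_request(self):
        if not self.reachable:
            return None  # the coordinator treats a missing vote as a timeout
        vote = "VOTE_COMMIT" if self.will_commit else "VOTE_ABORT"
        self.log.append(vote)  # log the vote before sending it
        return vote

    def on_decision(self, decision):
        # Only participants that voted to commit are waiting for a decision.
        if self.log[-1] == "VOTE_COMMIT":
            self.log.append(decision)


def run_2pc(participants, coordinator_votes_commit=True):
    coordinator_log = ["START_2PC"]
    # Phase 1: collect votes (None stands for a timeout).
    votes = [p.on_vote_request() for p in participants]
    all_yes = all(vote == "VOTE_COMMIT" for vote in votes)
    # Phase 2: commit only if every vote, including the coordinator's, is yes.
    decision = "GLOBAL_COMMIT" if all_yes and coordinator_votes_commit else "GLOBAL_ABORT"
    coordinator_log.append(decision)  # write the decision before multicasting it
    for p in participants:
        p.on_decision(decision)
    return decision, coordinator_log
```

Calling `run_2pc([Participant("a"), Participant("b", will_commit=False)])` yields a `GLOBAL_ABORT` decision, and the participant that voted to commit records that decision after its `VOTE_COMMIT` entry.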

Two-Phase Commit protocol - 5
[Figure: a) The finite state machine for the coordinator in 2PC. b) The finite state machine for a participant.]
- Both the coordinator and the participants have blocking states.
- When a failure occurs, other processes may wait indefinitely.

Two Phase Commit Protocol - 6
Timeouts
- ‘Wait’ in Coordinator: use a time-out mechanism to detect participant crashes. Send GLOBAL_ABORT.
- ‘Init’ in Participant: can also use a time-out and send VOTE_ABORT.
- ‘Ready’ in Participant P: abort is not an option (since P already voted to COMMIT, the coordinator might eventually send GLOBAL_COMMIT). P can contact another participant Q and choose an action based on Q’s state.
State of Q   Action by P
COMMIT       Transition to COMMIT
ABORT        Transition to ABORT
INIT         Both P and Q transition to ABORT (Q sends VOTE_ABORT)
READY        Contact more participants. If all participants are ‘READY’, must wait for the coordinator to recover
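
The state table for a participant P stuck in ‘Ready’ can be written as a small lookup. This is a sketch only: the function name `resolve_ready_timeout` and the `"BLOCKED"` result are assumptions introduced for illustration, and the states of the contacted participants are passed in as plain strings rather than obtained by real messaging.

```python
def resolve_ready_timeout(peer_states):
    """Decide what P (in READY) does after asking other participants.

    peer_states is an iterable of the states reported by the participants
    P managed to contact: "COMMIT", "ABORT", "INIT", or "READY".
    """
    for state in peer_states:
        if state == "COMMIT":
            return "COMMIT"  # the coordinator must have decided GLOBAL_COMMIT
        if state in ("ABORT", "INIT"):
            return "ABORT"  # Q aborted, or Q never voted and will vote abort
    # Every contacted participant is READY (or nobody answered):
    # P must wait for the coordinator to recover.
    return "BLOCKED"
```

The `"BLOCKED"` outcome is the case that motivates three-phase commit: when every reachable participant is in READY, none of them can tell which way the coordinator decided.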

Two Phase Commit Protocol - 7
Recovery
- To ensure that a process can actually recover, it must save its state to persistent storage.
- If a participant was in INIT (before the crash), it can safely decide to abort locally when it recovers and inform the coordinator.
- If it was in COMMIT or ABORT, it retransmits its decision to the coordinator.
- If it was in READY, it contacts another participant Q (sends DECISION_REQUEST), similar to the timeout situation.

Two-Phase Commit protocol - 8
Actions for handling decision requests after recovery: /* executed by separate thread */
    while true {
        wait until any incoming DECISION_REQUEST is received; /* remain blocked */
        read most recently recorded STATE from the local log;
        if STATE == GLOBAL_COMMIT
            send GLOBAL_COMMIT to requesting participant;
        else if STATE == INIT or STATE == GLOBAL_ABORT
            send GLOBAL_ABORT to requesting participant;
        else
            skip; /* participant remains blocked */
    }

Three Phase Commit protocol - 1
- Problem with 2PC
  - If the coordinator crashes, participants cannot reach a decision and stay blocked until the coordinator recovers.
- Three Phase Commit (3PC): proof in [SS 1983]. The states of the coordinator and of each participant satisfy two conditions:
  - There is no single state from which it is possible to make a transition directly to either the COMMIT or the ABORT state.
  - There is no state in which it is not possible to make a final decision, and from which a transition to COMMIT can be made.

Three-Phase Commit protocol - 2
[Figure: a) Finite state machine for the coordinator in 3PC. b) Finite state machine for a participant.]

Three Phase Commit Protocol - 3
Recovery
- ‘Wait’ in Coordinator: same as in 2PC.
- ‘Init’ in Participant: same as in 2PC.
- ‘PreCommit’ in Coordinator: some participant has crashed, but we know it wanted to commit. Send GLOBAL_COMMIT, knowing that once the participant recovers, it will commit.
- ‘Ready’ or ‘PreCommit’ in Participant P (i.e. P has voted to COMMIT): contact another participant Q and act on its state.
State of Q   Action by P
PRECOMMIT    Transition to PRECOMMIT. If all participants are in PRECOMMIT and form a majority, then COMMIT the transaction
ABORT        Transition to ABORT
INIT         Both P (in READY) and Q transition to ABORT (Q sends VOTE_ABORT). It can be shown that no other participant can be in PRECOMMIT
READY        Contact more participants. If P can contact a majority and they are in ‘Ready’, then ABORT the transaction. If the participants contacted are in ‘PreCommit’, it is safe to COMMIT the transaction
Note: if any participant is in state PRECOMMIT, it is impossible for any other participant to be in any state other than READY or PRECOMMIT.
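
The 3PC table can be turned into a small decision function. This sketch encodes only the rows listed above and is not a complete termination protocol: the function name `terminate_3pc`, the `total_participants` parameter, and the `"UNDECIDED"` result for cases the table does not cover (for example a mix of READY and PRECOMMIT, or too few reachable participants) are assumptions. It also assumes the majority requirement applies to the all-PRECOMMIT case, as the PRECOMMIT row states.

```python
def terminate_3pc(contacted_states, total_participants):
    """Apply the recovery table for a participant P in READY or PRECOMMIT.

    contacted_states lists the states of the participants P could reach,
    including P itself; total_participants is the size of the whole group.
    """
    states = list(contacted_states)
    majority = len(states) > total_participants // 2

    if "ABORT" in states or "INIT" in states:
        return "ABORT"  # rows ABORT and INIT
    if majority and all(s == "READY" for s in states):
        return "ABORT"  # row READY: a majority, all still in READY
    if majority and all(s == "PRECOMMIT" for s in states):
        return "COMMIT"  # row PRECOMMIT: all in PRECOMMIT and a majority
    return "UNDECIDED"  # not covered by the table; left open in this sketch
```

With five participants, `terminate_3pc(["READY", "READY", "READY"], 5)` returns `"ABORT"`, while three reachable participants all in PRECOMMIT give `"COMMIT"`.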

Things we have learned so far…
- ACID
- Concurrency control
- Distributed atomic commit

Two Views of Distributed Systems
- Optimist: A distributed system is a collection of independent computers that appears to its users as a single coherent system.
- Pessimist: “You know you have one when the crash of a computer you’ve never heard of stops you from getting any work done.” (Lamport)

Recurring Theme
- Academics like:
  - Clean abstractions
  - Strong semantics
  - Things that prove they are smart
- Users like:
  - Systems that work (most of the time)
  - Systems that scale
  - Consistency per se isn’t important
- Eric Brewer had the following observations

A Clash of Cultures
- Classic distributed systems: focused on ACID semantics (transaction semantics)
  - Atomicity: either the operation (e.g., write) is performed on all replicas or is not performed on any of them
  - Consistency: after each operation all replicas reach the same state
  - Isolation: no operation (e.g., read) can see the data from another operation (e.g., write) in an intermediate state
  - Durability: once a write has been successful, that write will persist indefinitely
- Modern Internet systems: focused on BASE
  - Basically Available
  - Soft-state (or scalable)
  - Eventually consistent

ACID vs BASE
ACID
- Strong consistency for transactions highest priority
- Availability less important
- Pessimistic
- Rigorous analysis
- Complex mechanisms
BASE
- Availability and scaling highest priorities
- Weak consistency
- Optimistic
- Best effort
- Simple and fast

Why Not ACID+BASE?
- What goals might you want from a system?
  - C, A, P
- Strong Consistency: all clients see the same view, even in the presence of updates
- High Availability: all clients can find some replica of the data, even in the presence of failures
- Partition-tolerance: the system properties hold even when the system is partitioned

CAP Theorem [Brewer]
- You can only have two out of these three properties.
- The choice of which feature to discard determines the nature of your system.

Consistency and Availability
- Comment:
  - Providing transactional semantics requires all functioning nodes to be in contact with each other (no partition)
- Examples:
  - Single-site and clustered databases
  - Other cluster-based designs
- Typical Features:
  - Two-phase commit
  - Cache invalidation protocols
  - Classic DS style

Partition-Tolerance and Availability
- Comment:
  - Once consistency is sacrificed, life is easy…
- Examples:
  - DNS
  - Web caches
  - Practical distributed systems for mobile environments: Coda, Bayou, Dropbox
- Typical Features:
  - Optimistic updating with conflict resolution
  - This is the “Internet design style”
  - TTLs and lease cache management

Voting with their Clicks
- In terms of large-scale systems, the world has voted with their clicks:
  - Consistency less important than availability and partition-tolerance