Eventual Consistency Jinyang
Sequential consistency
• Sequential consistency properties:
  – Latest read must see latest write
    • Handles caching
  – All writes are applied in a single order
    • Handles concurrent writes
• Realizing sequential consistency:
  – Reads/writes from a single node execute one at a time
  – All reads/writes to address X must be ordered by one memory/storage module responsible for X
Realizing sequential consistency
[Figure: two nodes, each labeled "Cache or replica", issue W(A)1, W(A)2, W(B)3, and R(B); one arrow is labeled "Invalidate"]
Disadvantages of sequential consistency
• Requires highly available connections
  – Lots of chatter between clients/servers
• Not suitable for certain scenarios:
  – Disconnected clients (e.g. your laptop)
  – Apps might prefer potential inconsistency to loss of availability
Why (not) eventual consistency?
• Support disconnected operations
  – Better to read a stale value than nothing
  – Better to save writes somewhere than nothing
• Potentially anomalous application behavior
  – Stale reads and conflicting writes…
Operating w/o total connectivity
Client writes to its local replica
No sync between clients
Sync w/ server resolves non-conflicting changes, reports conflicting ones to user
[Figure: writes W(A)1 and W(A)2 are applied to local replicas, which sync w/ the server]
Pair-wise synchronization
Pair-wise sync resolves non-conflicting changes, reports conflicting ones to users
[Figure: three replicas, w/ writes W(A)1, W(B)3, and W(A)2]
Example usages?
• File synchronizers
  – One user, many gadgets
File synchronizer
• Goal
  1. All replica contents eventually become identical
  2. No lost updates
    – Do not replace a new version with an old one
Prevent lost updates
• Detect if updates were sequential
  – If so, replace old version with new one
  – If not, detect conflict
• “Optimistic” vs. “Pessimistic”
  – Eventual Consistency: Let updates happen, worry about whether they can be serialized later
  – Sequential Consistency: Updates cannot take effect unless they are serialized first
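How to prevent lost updates?
• Strawman: use mtime to decide which version should replace the other
• Problem w/ wallclock: cannot detect disagreement on ordering
  – W(f)b and W(f)c both start from version a, so they conflict, but comparing mtimes alone picks c (23657 > 16679) and silently drops b
[Figure: H1 performs W(f)a (f mtime: 15648) and then W(f)b (f mtime: 16679); H2's copy of f shows mtimes 12354, 15648, and, after W(f)c, 23657]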
Strawman fix
• Carry the entire modification history
• If history X is a prefix of Y, Y is newer
[Figure: W(f)a has history H1:15648; W(f)b has history H1:15648, H1:16679; W(f)c (on H2) has history H1:15648, H2:23657]
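The prefix rule above can be sketched as follows. This is a minimal illustration, not a synchronizer implementation: the function name `compare_histories` and the list-of-pairs representation of a history are assumptions made for the example, and the sample values are the mtimes shown on the slide.

```python
def compare_histories(x, y):
    """Compare two modification histories.

    Each history is a list of (host, mtime) entries, oldest first.
    (This representation is an assumption made for illustration.)
    """
    if x == y:
        return "equal"
    if len(x) < len(y) and y[:len(x)] == x:
        return "y_newer"      # x is a prefix of y
    if len(y) < len(x) and x[:len(y)] == y:
        return "x_newer"      # y is a prefix of x
    return "conflict"         # neither history extends the other


h_a = [("H1", 15648)]
h_b = [("H1", 15648), ("H1", 16679)]   # W(f)b on H1
h_c = [("H1", 15648), ("H2", 23657)]   # W(f)c on H2

print(compare_histories(h_a, h_b))   # b extends a
print(compare_histories(h_b, h_c))   # b and c diverge after a
```

Unlike the mtime strawman, the second comparison reports a conflict instead of silently preferring the larger wall-clock value.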
Compress version history
• H1:2 implies H1:1, so we only need one number per host
[Figure: the histories rewritten w/ per-host counters: W(f)a → H1:1; W(f)b → H1:1, H1:2, which compresses to H1:2; W(f)c on H2 → H1:1, H2:1]
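Compare vector timestamp
[Figure: three vector timestamps compared element-wise: (H1:1 H2:3 H3:2), (H1:1 H2:5 H3:7), (H1:2 H2:1 H3:7)]
• (H1:1 H2:3 H3:2) < (H1:1 H2:5 H3:7): every entry is less than or equal, and at least one is strictly less
• (H1:2 H2:1 H3:7) is not ordered w/ respect to either of the other two: it is larger in H1 but smaller in H2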
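The element-wise comparison can be written out as a short sketch. The helper names `leq` and `compare` and the dict representation (host name to counter, missing hosts counted as 0) are assumptions made for the example; the three vectors are the ones on the slide.

```python
def leq(a, b):
    """True if every entry of a is <= the matching entry of b.

    Vector timestamps are dicts from host name to counter;
    a host missing from a dict is treated as 0.
    """
    return all(a.get(h, 0) <= b.get(h, 0) for h in set(a) | set(b))


def compare(a, b):
    """Classify a relative to b: equal, before, after, or concurrent."""
    a_le_b, b_le_a = leq(a, b), leq(b, a)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"       # a < b: b has seen everything a has
    if b_le_a:
        return "after"
    return "concurrent"       # neither dominates: potential conflict


v1 = {"H1": 1, "H2": 3, "H3": 2}
v2 = {"H1": 1, "H2": 5, "H3": 7}
v3 = {"H1": 2, "H2": 1, "H3": 7}

print(compare(v1, v2))   # before
print(compare(v2, v3))   # concurrent
print(compare(v1, v3))   # concurrent
```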
Using vector timestamp
[Figure: H1 performs W(f)a (H1:1) and W(f)b (H1:2); H2 performs W(f)c; timestamps shown on H2's side: H1:1, H1:2, and H1:1 H2:1]
Using vector timestamp
[Figure: H1 performs W(f)a (H1:1) and W(f)b (H1:2); H2 holds H1:1 and performs W(f)c, giving H1:1 H2:1]
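A file synchronizer can use these timestamps to decide between replacing a version and reporting a conflict. The sketch below is a simplified illustration under stated assumptions: the `Replica` class, the `dominates` helper, and the one-way `sync` function are hypothetical names, and a local write is assumed to increment the writing host's own entry.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    host: str
    content: str = ""
    vt: dict = field(default_factory=dict)   # vector timestamp: host -> counter

    def write(self, content):
        # Assumption: a local write bumps this host's own counter.
        self.content = content
        self.vt[self.host] = self.vt.get(self.host, 0) + 1


def dominates(a, b):
    """True if vector a is >= vector b in every entry (missing = 0)."""
    return all(b.get(h, 0) <= a.get(h, 0) for h in set(a) | set(b))


def sync(src, dst):
    """One-way sync of a single file from src to dst."""
    if dominates(dst.vt, src.vt):
        return "up-to-date"               # dst already includes src's version
    if dominates(src.vt, dst.vt):
        dst.content = src.content         # sequential updates: safe to replace
        dst.vt = dict(src.vt)
        return "replaced"
    return "conflict"                     # concurrent updates: report to user


h1, h2 = Replica("H1"), Replica("H2")
h1.write("a")            # H1:1
print(sync(h1, h2))      # replaced
h1.write("b")            # H1:2
h2.write("c")            # H1:1 H2:1
print(sync(h1, h2))      # conflict
```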
How to deal w/ conflicts?
• Easy: mailboxes w/ two different sets of messages
• Medium: changes to different lines of a C source file
• Hard: changes to the same line of a C source file
• After conflict resolution, what should the vector timestamp be?
What about file deletion?
• Can we forget about the vector timestamp for deleted files?
• Simple solution: treat deletion as a write
  – Conflicts involving a deleted file are easy
• Downside:
  – Need to remember vector timestamp for deleted files indefinitely
Tra [Cox, Josephson]
• What are Tra’s novel properties?
  – Easy to compress storage of vector timestamps
  – No need to check every file’s version vector during sync
  – Allows partial sync of subtrees
  – No need to keep timestamp for deleted files forever
Tra’s key technique
• Two vector timestamps:
  1. One represents modification time
    – Tracks what a host has
  2. One represents synchronization time
    – Tracks what a host knows
• Sync time implies that no modification has happened since the mod time
[Figure: example pair of vector timestamps: H1:1 H2:5 H3:7 and H1:10 H2:20 H3:25]
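One way to read the "has" vs. "knows" distinction is as a per-file sync rule: skip when the receiver already knows about the sender's version, copy when the sender knows about the receiver's version, and otherwise report a conflict. The sketch below illustrates that reading only; `FileState`, `leq`, `merge`, and `sync` are hypothetical names, and modeling both times as full vectors is an assumption.

```python
from dataclasses import dataclass


def leq(a, b):
    """Element-wise a <= b on vector timestamps (missing host = 0)."""
    return all(a.get(h, 0) <= b.get(h, 0) for h in set(a) | set(b))


def merge(a, b):
    """Element-wise max of two vector timestamps."""
    return {h: max(a.get(h, 0), b.get(h, 0)) for h in set(a) | set(b)}


@dataclass
class FileState:
    content: str
    mtime: dict      # what the host has
    synctime: dict   # what the host knows


def sync(src, dst):
    """One-way sync of one file from src to dst (illustrative only)."""
    if leq(src.mtime, dst.synctime):
        outcome = "skip"          # dst already knows about src's version
    elif leq(dst.mtime, src.synctime):
        dst.content = src.content # src's version was derived from dst's
        dst.mtime = dict(src.mtime)
        outcome = "copy"
    else:
        return "conflict"         # each side has changes the other lacks
    # After a successful sync, dst knows everything src knew.
    dst.synctime = merge(dst.synctime, src.synctime)
    return outcome


on_h1 = FileState("a", {"H1": 1}, {"H1": 1})
on_h2 = FileState("", {}, {})
print(sync(on_h1, on_h2))   # copy
print(sync(on_h1, on_h2))   # skip
```

Because the skip test only compares one mod time against one sync time, it does not need the receiver's full version history.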
Using sync time
[Figure: H1 performs W(f1)a and W(f2)b; on H1 and H2, f1 and f2 are each annotated w/ a mod time and a sync time, e.g. f1 w/ mod time H1:1 and f2 w/ mod time H1:2, and sync times such as H1:2 H2:0]
Compress mtime and synctime
• dir synctime = element-wise min of child sync times
• dir mtime = element-wise max of child mod times
• Sync(d1 → d1’)
  – Skip d1 if mtime of d1 is less than synctime of d1’
• Can we achieve this with a single mtime?
  – Skip d1 if mtime of d1 is less than mtime of d1’
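The directory rules above can be sketched in a few lines. The function names (`dir_synctime`, `dir_mtime`, `can_skip`) and the dict representation are assumptions for illustration, and "less than" is interpreted here as element-wise less-than-or-equal.

```python
def leq(a, b):
    """Element-wise a <= b on vector timestamps (missing host = 0)."""
    return all(a.get(h, 0) <= b.get(h, 0) for h in set(a) | set(b))


def dir_synctime(child_synctimes):
    """Directory sync time: element-wise min over the children."""
    hosts = set().union(*child_synctimes) if child_synctimes else set()
    return {h: min(c.get(h, 0) for c in child_synctimes) for h in hosts}


def dir_mtime(child_mtimes):
    """Directory mod time: element-wise max over the children."""
    hosts = set().union(*child_mtimes) if child_mtimes else set()
    return {h: max(c.get(h, 0) for c in child_mtimes) for h in hosts}


def can_skip(src_dir_mtime, dst_dir_synctime):
    """Skip the whole directory if dst already knows everything src has."""
    return leq(src_dir_mtime, dst_dir_synctime)


# Source directory d: f1 modified at H1:1, f2 modified at H1:2.
src_mtime = dir_mtime([{"H1": 1}, {"H1": 2}])

# Destination synced only f1 so far: f2's sync time is still H1:0.
partial = dir_synctime([{"H1": 2}, {"H1": 0}])
print(can_skip(src_mtime, partial))   # False: f2 may still be missing

# Destination has synced both children.
full = dir_synctime([{"H1": 2}, {"H1": 2}])
print(can_skip(src_mtime, full))      # True
```

Taking the min for the sync time is what keeps a partially synced directory from being skipped.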
Synctime enables partial synchronization
• Directory d1 contains f1 and f2; suppose a host syncs a subtree (d1/f1)
  – With synctime+mtime: synctime of d1 does not change. Mtime of d1 increases
  – With mtime only: Mtime of d1 increases
• Host later syncs subtree d1/f2
  – With synctime+mtime: will pull in modifications in f2 because synctime of d1 is smaller
  – With mtime only: skips d1 because mtime is high enough
Using sync time
[Figure: H1 performs W(f1)a and W(f2)b, so directory d on H1 holds f1 (H1:1) and f2 (H1:2); H2 performs partial syncs labeled "Sync f1 only" and "Sync f2 only", w/ mod and sync times shown for f1, f2, and d at each step]
How to deal w/ deletion
• Deletion notice for a deleted file contains its sync time
[Figure: H1 performs W(f1)a and D(f2), leaving f1 at H1:1 and d at H1:2 H2:0; the deletion notice for f2 carries sync time H1:2 H2:0; H2 holds f1, f2, and d and syncs w/ H1]
How to deal w/ deletion
• Deletion notice for a deleted file contains its sync time
[Figure: same scenario on H1 (W(f1)a, D(f2), deletion notice w/ sync time H1:2 H2:0), except that H2 has modified f2 (H2:1), so d on H2 shows H2:1 in its timestamps]
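A deletion notice that carries only a sync time is still enough to decide what the receiving host should do w/ its own copy. The sketch below is a simplified illustration: `apply_deletion` is a hypothetical name, the rule ignores file creation times, and the sample vectors are loosely modeled on the two scenarios above rather than copied from the figures.

```python
def leq(a, b):
    """Element-wise a <= b on vector timestamps (missing host = 0)."""
    return all(a.get(h, 0) <= b.get(h, 0) for h in set(a) | set(b))


def apply_deletion(notice_synctime, local_mtime):
    """Decide what to do w/ a local file when the peer reports it deleted.

    notice_synctime: sync time carried by the peer's deletion notice
    local_mtime:     mod time of the receiver's copy of the file
    """
    if leq(local_mtime, notice_synctime):
        # The peer knew about this exact version when it deleted the file.
        return "delete"
    # The local copy has changes the deleting host never saw.
    return "conflict"


notice = {"H1": 2, "H2": 0}

# Receiver's copy is an old version that originated on H1.
print(apply_deletion(notice, {"H1": 1}))   # delete

# Receiver modified the file locally (H2:1) before seeing the notice.
print(apply_deletion(notice, {"H2": 1}))   # conflict
```

No per-file timestamp has to be kept for the deleted file itself; the notice's sync time can come from the enclosing directory.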
Another definition of eventual consistency
• Eventual consistency (Tra)
  – All replica contents are eventually identical
  – Do not care about individual writes, just overwrite old replica w/ new one
• Eventual consistency (Bayou)
  – Writes are eventually applied in total order
  – Reads might not see most recent writes in total order
Bayou
[Figure: nodes N0, N1, and N2, each w/ an empty write log and version vector 0:0 1:0 2:0]
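Bayou propagation
[Figure: N0's write log holds 1:0 W(x), 2:0 W(y), 3:0 W(z), version vector 0:3 1:0 2:0; N1's write log holds 1:1 W(x), version vector 0:0 1:1 2:0; N2's log is empty, version vector 0:0 1:0 2:0. N0 propagates its three writes together w/ its version vector 0:3 1:0 2:0]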
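The propagation step can be sketched as log exchange driven by version vectors: the sender ships only the writes the receiver's vector shows it has not seen, and every node keeps its log sorted by (timestamp, node ID). This is a simplified illustration, not Bayou's actual protocol: the `Node` class and its methods are hypothetical, and the version vector here records only the highest write timestamp seen from each node.

```python
class Node:
    def __init__(self, node_id, num_nodes):
        self.id = node_id
        self.clock = 0                 # logical clock for local writes
        self.log = []                  # entries: (timestamp, node_id, op)
        self.vv = [0] * num_nodes      # highest timestamp seen per node

    def write(self, op):
        self.clock += 1
        self.log.append((self.clock, self.id, op))
        self.log.sort()
        self.vv[self.id] = self.clock

    def send_to(self, other):
        # Ship only the writes newer than what `other` has seen per origin.
        missing = [w for w in self.log if w[0] > other.vv[w[1]]]
        other.receive(missing)

    def receive(self, writes):
        for ts, node_id, op in writes:
            self.log.append((ts, node_id, op))
            self.vv[node_id] = max(self.vv[node_id], ts)
            self.clock = max(self.clock, ts)
        self.log.sort()                # total order: (timestamp, node ID)


n0, n1 = Node(0, 3), Node(1, 3)
for op in ("W(x)", "W(y)", "W(z)"):
    n0.write(op)
n1.write("W(x)")

n0.send_to(n1)
print([f"{ts}:{nid} {op}" for ts, nid, op in n1.log])
# ['1:0 W(x)', '1:1 W(x)', '2:0 W(y)', '3:0 W(z)']
```

Sorting by (timestamp, node ID) is what lets two nodes that hold the same set of writes agree on their order without talking to each other again.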
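Bayou propagation
[Figure: N0's write log still holds 1:0 W(x), 2:0 W(y), 3:0 W(z), version vector 0:3 1:0 2:0; N1's write log now holds 1:0 W(x), 1:1 W(x), 2:0 W(y), 3:0 W(z), version vector 0:3 1:4 2:0; the write 1:1 W(x) is being propagated; N2 is unchanged, version vector 0:0 1:0 2:0]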
Bayou propagation
[Figure: N0 and N1 both hold 1:0 W(x), 1:1 W(x), 2:0 W(y), 3:0 W(z); version vectors shown: 0:3 1:4 2:0 and 0:4 1:4 2:0; N2 is still empty, version vector 0:0 1:0 2:0]
• Which portion of the log is stable?
Bayou propagation
[Figure: N0, N1, and N2 all hold 1:0 W(x), 1:1 W(x), 2:0 W(y), 3:0 W(z); version vectors shown: 0:3 1:4 2:0, 0:4 1:4 2:0, and, for N2, 0:3 1:4 2:5]
Bayou propagation
[Figure: N0, N1, and N2 all hold 1:0 W(x), 1:1 W(x), 2:0 W(y), 3:0 W(z); version vectors shown: 0:4 1:4 2:0, 0:3 1:6 2:5, 0:3 1:4 2:5, and 0:4 1:4 2:5]
Bayou uses a primary to commit a total order
• Why is it important to make log stable?
  – Stable writes can be committed
  – Stable portion of the log can be truncated
• Problem: If any node is offline, the stable portion of all logs stops growing
• Bayou’s solution:
  – A designated primary defines a total commit order
  – Primary assigns CSNs (commit-seq-no)
  – Any write with a known CSN is stable
  – All stable writes are ordered before tentative writes
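The CSN rule can be sketched as a sort key: a tentative write carries an infinite CSN, so every committed write sorts ahead of it. The names below (`Write`, `Primary`, `log_order`, `stable_prefix`) are hypothetical, and the primary is assumed to hand out CSNs in the order it is asked to commit.

```python
import math
from dataclasses import dataclass

TENTATIVE = math.inf   # CSN of a write the primary has not committed yet


@dataclass
class Write:
    ts: int                  # timestamp at the originating node
    node: int                # originating node ID
    op: str
    csn: float = TENTATIVE


def log_order(writes):
    """Stable writes first (by CSN), then tentative ones by (ts, node)."""
    return sorted(writes, key=lambda w: (w.csn, w.ts, w.node))


def stable_prefix(writes):
    """The portion of the log whose order can no longer change."""
    return [w for w in log_order(writes) if w.csn != TENTATIVE]


class Primary:
    def __init__(self):
        self.next_csn = 1

    def commit(self, write):
        # Assumption: CSNs are assigned in the order writes reach the primary.
        if write.csn == TENTATIVE:
            write.csn = self.next_csn
            self.next_csn += 1
        return write


primary = Primary()
own = [Write(1, 0, "W(x)"), Write(2, 0, "W(y)"), Write(3, 0, "W(z)")]
for w in own:
    primary.commit(w)                 # CSNs 1, 2, 3
remote = Write(1, 1, "W(x)")          # tentative: CSN is infinity

log = own + [remote]
print([(w.csn, w.ts, w.node) for w in log_order(log)])
print(len(stable_prefix(log)))        # 3

primary.commit(remote)                # now CSN 4
print(len(stable_prefix(log)))        # 4
```

Note that the committed order can differ from the timestamp order: the write 1:1 sorts after 3:0 here because it was committed later.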
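Bayou propagation
[Figure: log entries now carry a CSN prefix. N0's write log holds 1:1:0 W(x), 2:2:0 W(y), 3:3:0 W(z), version vector 0:3 1:0 2:0; N1 holds the tentative write ∞:1:1 W(x), version vector 0:0 1:1 2:0, and propagates it to N0; N2 is empty, version vector 0:0 1:0 2:0]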
Bayou propagation
[Figure: N0 appends the write as 4:1:1 W(x), version vector 0:4 1:1 2:0; N1's log, previously ∞:1:1 W(x) w/ version vector 0:0 1:1 2:0, becomes 1:1:0 W(x), 2:2:0 W(y), 3:3:0 W(z), 4:1:1 W(x), version vector 0:4 1:1 2:0; N2 is still empty, version vector 0:0 1:0 2:0]
Bayou’s limitations
• Primary cannot fail
• Server creation & retirement makes node ID grow arbitrarily long
• Anomalous behaviors for apps?
  – Calendar app