Transaction chains: achieving serializability with low latency in geo-distributed storage systems. Yang Zhang, Russell Power, Siyuan Zhou, Yair Sovran, *Marcos K. Aguilera, Jinyang Li. New York University, *Microsoft Research Silicon Valley
Why geo-distributed storage? Large-scale Web applications Geo-distributed storage Replication
Geo-distribution is hard Low latency: O(Intra-datacenter RTT) Strong semantics: relational tables w/ transactions
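Prior work [Figure: systems placed by consistency level (eventual, various non-serializable, serializable, strict serializable) and by transaction generality (key/value only, limited forms of transaction, general transaction).] Low latency: Dynamo [SOSP'07], COPS [SOSP'11], Walter [SOSP'11], Eiger [NSDI'13]. High latency: Spanner [OSDI'12]. Strict serializability has provably high latency according to CAP. Our work: serializable, general transactions at low latency?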
Our contributions 1. A new primitive: transaction chain – Allows for low-latency, serializable transactions 2. Lynx geo-storage system: built with chains – Relational tables – Secondary indices, materialized join views
Talk Outline • Motivation • Transaction chains • Lynx • Evaluation
Why transaction chains? Example: auction service. [Figure: a Bids table (Bidder, Item, Price) and an Items table (Seller, Item, Highest bid), with sample rows for Alice and Bob covering Book, iPhone and Camera. Alice's data lives in Datacenter-1 and Bob's data in Datacenter-2.]
Why transaction chains? Operation: Alice bids on Bob's camera 1. Insert bid into Alice's Bids 2. Update highest bid on Bob's Items [Figure: Alice's Bids holds (Alice, Book, $100) in Datacenter-1; Bob's Items holds (Bob, Camera, $100) in Datacenter-2.]
Low latency with first-hop return. Alice bids on Bob's camera. [Figure: the row (Alice, Camera, $500) is added to Alice's Bids in Datacenter-1, next to (Alice, Book, $100); the highest bid on Bob's camera in Datacenter-2 changes from $100 to $500.]
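First-hop return can be sketched in a few lines, assuming a toy single-process model in which "datacenters" are plain Python objects. The names ChainRunner, submit and drain are hypothetical and are not part of Lynx; the sketch only shows that the client is answered after the first hop while later hops run afterwards.

```python
from collections import deque

class ChainRunner:
    """Toy model of a chain: a list of hops (callables) run in order."""

    def __init__(self):
        # Remaining hops of chains whose first hop has already committed.
        self.pending = deque()

    def submit(self, hops):
        first, rest = hops[0], hops[1:]
        result = first()           # first hop runs in the local datacenter
        self.pending.append(rest)  # later hops are deferred
        return result              # the client is answered here

    def drain(self):
        # Stand-in for the background work that finishes the chains.
        while self.pending:
            for hop in self.pending.popleft():
                hop()

alice_bids = [("Alice", "Book", 100)]   # Datacenter-1
bob_items = {"Camera": 100}             # Datacenter-2

def insert_bid():
    alice_bids.append(("Alice", "Camera", 500))
    return "bid accepted"

def update_highest_bid():
    bob_items["Camera"] = max(bob_items["Camera"], 500)

runner = ChainRunner()
reply = runner.submit([insert_bid, update_highest_bid])
state_at_return = dict(bob_items)  # Datacenter-2 has not been touched yet
runner.drain()
```

In this model the reply is available while Bob's item still shows $100; the second hop brings it to $500 later.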
Problem: what if chains fail? 1. What if servers fail after executing first-hop? 2. What if a chain is aborted in the middle?
Solution: provide all-or-nothing atomicity 1. Chains are durably logged at the first hop – Logs are replicated to the closest other datacenter – Chains are re-executed upon recovery 2. Chains allow user aborts only at the first hop • Guarantee: if the first hop commits, all hops eventually commit
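The log-then-re-execute idea can be illustrated as follows. This is a minimal sketch, not Lynx's logging code: FirstHopLog, its record layout, and recover are hypothetical names, a Python list stands in for replicated durable storage, and hops are assumed to be safe to re-run.

```python
import json

class FirstHopLog:
    """Hypothetical append-only log; a list stands in for durable storage."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(json.dumps(record))

    def unfinished(self):
        # Chains whose first hop committed but that were never marked done.
        started, done = {}, set()
        for raw in self.records:
            rec = json.loads(raw)
            if rec["type"] == "commit_first_hop":
                started[rec["chain_id"]] = rec["remaining_hops"]
            elif rec["type"] == "chain_done":
                done.add(rec["chain_id"])
        return {cid: hops for cid, hops in started.items() if cid not in done}

def recover(log, execute_hop):
    """Re-execute the remaining hops of every stalled chain, in order."""
    for chain_id, hops in log.unfinished().items():
        for hop in hops:
            execute_hop(chain_id, hop)
        log.append({"type": "chain_done", "chain_id": chain_id})

log = FirstHopLog()
log.append({"type": "commit_first_hop", "chain_id": 1,
            "remaining_hops": ["update_items"]})
log.append({"type": "chain_done", "chain_id": 1})
log.append({"type": "commit_first_hop", "chain_id": 2,
            "remaining_hops": ["update_items", "update_index"]})

replayed = []
recover(log, lambda cid, hop: replayed.append((cid, hop)))
```

Only chain 2 is replayed, because chain 1 already has a completion record.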
Problem: non-serializable interleaving • Concurrent chains ordered inconsistently at different hops [Figure: over time, T1 writes X=1 then Y=1 while T2 writes X=2 then Y=2. Server-X orders T1 < T2, Server-Y orders T2 < T1: not serializable!] • Traditional 2PL+2PC prevents non-serializable interleaving at the cost of high latency
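The inconsistency on this slide can be checked mechanically. The sketch below assumes every hop listed for a server conflicts with the others on that server, and uses a hypothetical helper is_serializable: it builds a precedence graph from the per-server orders and reports whether that graph is acyclic.

```python
def is_serializable(per_server_orders):
    """per_server_orders: {server: [chain, ...]} in execution order.

    Returns True when the per-server orders can be merged into one
    serial order, i.e. the precedence graph has no cycle.
    """
    edges = {}
    for order in per_server_orders.values():
        for i, earlier in enumerate(order):
            for later in order[i + 1:]:
                edges.setdefault(earlier, set()).add(later)

    visiting, done = set(), set()

    def has_cycle(node):
        if node in done:
            return False
        if node in visiting:
            return True          # back edge: we returned to an open node
        visiting.add(node)
        found = any(has_cycle(nxt) for nxt in edges.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return not any(has_cycle(node) for node in list(edges))

slide_example = {"Server-X": ["T1", "T2"], "Server-Y": ["T2", "T1"]}
consistent = {"Server-X": ["T1", "T2"], "Server-Y": ["T1", "T2"]}

slide_ok = is_serializable(slide_example)
consistent_ok = is_serializable(consistent)
```

The slide's interleaving yields the cycle T1 < T2 < T1, so no serial order explains it.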
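Solution: detect non-serializable interleaving via static analysis • Statically analyze all chains to be executed – Web applications invoke a fixed set of operations [Figure: T1 (X=1, Y=1) and T2 (X=2, Y=2), with conflict edges between the hops that write the same item.] An SC-cycle contains both kinds of edges (red and blue in the figure): S-edges linking hops of the same chain and C-edges linking conflicting hops of different chains. Serializable if no SC-cycle [Shasha et al., TODS'95]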
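A simplified version of the SC-cycle check is sketched below. It is not Lynx's analyzer: the function names are hypothetical, each hop is described by an assumed list of (item, mode) pairs, chain instances are passed in explicitly, and S-edges are drawn only between consecutive hops. Under those assumptions an SC-cycle exists exactly when some S-edge can be bypassed through the rest of the graph.

```python
from itertools import combinations

def conflicts(hop_a, hop_b):
    """Two hops conflict if they touch the same item and one of them writes."""
    return any(xa == xb and "w" in (ma, mb)
               for xa, ma in hop_a for xb, mb in hop_b)

def build_sc_graph(chains):
    """chains: {name: [hop, ...]}, hop = list of (item, 'r' or 'w')."""
    s_edges, c_edges = [], []
    for name, hops in chains.items():
        for i in range(len(hops) - 1):          # S-edges: same chain
            s_edges.append(((name, i), (name, i + 1)))
    for (a, hops_a), (b, hops_b) in combinations(chains.items(), 2):
        for i, hop_a in enumerate(hops_a):      # C-edges: different chains
            for j, hop_b in enumerate(hops_b):
                if conflicts(hop_a, hop_b):
                    c_edges.append(((a, i), (b, j)))
    return s_edges, c_edges

def has_sc_cycle(s_edges, c_edges):
    all_edges = list(s_edges) + list(c_edges)
    for skip in range(len(s_edges)):            # drop one S-edge at a time
        start, goal = s_edges[skip]
        adjacency = {}
        for idx, (u, v) in enumerate(all_edges):
            if idx == skip:
                continue
            adjacency.setdefault(u, []).append(v)
            adjacency.setdefault(v, []).append(u)
        seen, stack = {start}, [start]
        while stack:                            # can we still reach the goal?
            node = stack.pop()
            if node == goal:
                return True
            for nxt in adjacency.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return False

slide_chains = {"T1": [[("X", "w")], [("Y", "w")]],
                "T2": [[("X", "w")], [("Y", "w")]]}
safe_chains = {"T3": [[("X", "w")], [("Y", "w")]],
               "T4": [[("X", "r")]]}

slide_has_cycle = has_sc_cycle(*build_sc_graph(slide_chains))
safe_has_cycle = has_sc_cycle(*build_sc_graph(safe_chains))
```

T1 and T2 conflict at both hops, which closes a cycle through two S-edges and two C-edges; T3 and T4 share a single conflict, so no cycle forms.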
Outline • Motivation • Transaction chains • Lynx's design • Evaluation
How Lynx uses chains • User chains: used by programmers to implement application logic • System chains: used internally to maintain – Secondary indexes – Materialized join views – Geo-replicas
Example: secondary index [Figure: the Bids base table (Bidder, Item, Price) shown next to the Bids secondary index. Both hold the same bids by Alice and Bob on Camera, iPhone, Book and Car, ordered by different columns.]
Example user and system chain. Alice bids on Bob's camera. [Figure: the row (Alice, Camera, $100) is added next to (Alice, Book, $100) in Datacenter-1, and Bob's Camera item in Datacenter-2 shows $100.]
Lynx statically analyzes all chains beforehand [Figure: the chain Read-bids (Read Bids table) and the chain Put-bid (Insert to Bids table, Update Items table) form an SC-cycle.] One solution: execute chain as a distributed transaction
SC-cycle source #1: false conflicts in user chains [Figure: two instances of Put-bid, each with hops Insert to Bids table and Update Items table.] The conflict on the Items update is a false conflict because max(bid, current_price) commutes
Solution: users annotate commutativity [Figure: in Put-bid (Insert to Bids table, Update Items table), the hop Update Items table is annotated as commutative.]
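Why the max update is safe to annotate can be shown with a small check. This is an illustrative sketch with hypothetical function names: it applies the same bids in every possible order and collects the distinct final prices. A commutative update ends in a single state regardless of order; a plain overwrite does not.

```python
from itertools import permutations

def update_highest_bid(current_price, bid):
    # The update from the slide: max(bid, current_price).
    return max(bid, current_price)

def overwrite_price(current_price, bid):
    # A non-commutative update, for contrast.
    return bid

def final_states(op, initial, bids):
    """Apply the bids in every order and collect the distinct outcomes."""
    results = set()
    for order in permutations(bids):
        price = initial
        for bid in order:
            price = op(price, bid)
        results.add(price)
    return results

max_outcomes = final_states(update_highest_bid, 100, [500, 300])
overwrite_outcomes = final_states(overwrite_price, 100, [500, 300])
```

With max, both orders end at 500, so the order chosen at the Items server cannot be observed; with overwrite, the outcome depends on which bid arrived last.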
SC-cycle source #2: system chains [Figure: the user chain Put-bid (Insert to Bids table, ...) and the system chain that maintains Bids-secondary form an SC-cycle.]
Solution: chains provide origin-ordering • Observation: conflicting system chains originate at the same first-hop server. [Figure: T1 and T2 each run Insert to Bids table, then Insert to Bids-secondary; both write the same row of the Bids table.] • Origin-ordering: if chains T1 < T2 at the same first hop, then T1 < T2 at all subsequent overlapping hops. – Can be implemented cheaply with sequence number vectors
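One way to picture origin-ordering with sequence numbers is sketched below. The classes OriginServer and DownstreamServer are hypothetical and simplify whatever Lynx actually does: the first-hop server hands out one counter per destination, and a downstream server runs hops from a given origin strictly in counter order, buffering any that arrive early.

```python
class OriginServer:
    """First-hop server: stamps outgoing hops per destination."""

    def __init__(self, name):
        self.name = name
        self.next_seq = {}  # destination -> next sequence number to hand out

    def stamp(self, destination):
        seq = self.next_seq.get(destination, 0)
        self.next_seq[destination] = seq + 1
        return (self.name, seq)

class DownstreamServer:
    """Later-hop server: executes hops from each origin in stamp order."""

    def __init__(self):
        self.expected = {}  # origin -> next sequence number to execute
        self.buffered = {}  # (origin, seq) -> hop waiting for its turn
        self.executed = []

    def receive(self, stamp, hop):
        self.buffered[stamp] = hop
        origin, _ = stamp
        # Run every hop from this origin that is now next in line.
        while (origin, self.expected.get(origin, 0)) in self.buffered:
            seq = self.expected.get(origin, 0)
            self.executed.append(self.buffered.pop((origin, seq)))
            self.expected[origin] = seq + 1

origin = OriginServer("bids-server")
index = DownstreamServer()
stamp_t1 = origin.stamp("index-server")  # T1 ran first at the first hop
stamp_t2 = origin.stamp("index-server")

index.receive(stamp_t2, "T2: insert to Bids-secondary")  # arrives early
executed_after_t2_only = list(index.executed)
index.receive(stamp_t1, "T1: insert to Bids-secondary")
```

T2's hop waits in the buffer until T1's hop has run, so the downstream order matches the first-hop order even when messages are reordered in transit.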
Limitations of Lynx/chains 1. Chains are not strictly serializable, only serializable. 2. Programmers can abort only at first hop • Our application experience: limitations are manageable
Outline • Motivation • Transaction chains • Lynx's design • Evaluation
Simple Twitter Clone on Lynx
Tweets (geo-replicated), columns Author, Tweet: (Alice, "New York rocks"), (Bob, "Time to sleep"), (Eve, "Hi there")
Follow-Graph (geo-replicated), columns From, To: (Alice, Bob), (Alice, Eve), (Bob, Clark)
Follow-Graph (secondary), columns To, From
Timeline = Tweets JOIN Follow-Graph, columns Author (=To), From, Tweet: (Bob, Alice, "Time to sleep"), (Eve, Alice, "Hi there")
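Experimental setup: three datacenters (us-west, us-east, europe). Round-trip times: 82 ms between us-west and us-east; 153 ms and 102 ms on the two links to europe. Lynx prototype: • In-memory database • Local disk logging only.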
Returning on first-hop allows low latency [Bar chart of latency (ms) for Follow-user, Post-tweet and Read-timeline. Chain completion: 252 ms and 174 ms. First-hop return: 3.2 ms and 3.1 ms.]
Applications achieve good throughput [Bar chart, million ops/sec: Follow-User 0.184, Post-Tweet 0.173, Read-Timeline 1.35.]
Related work • Transaction decomposition – SAGAS [SIGMOD'96], step-decomposed transactions • Incremental view maintenance – Views for PNUTS [SIGMOD'09] • Various geo-distributed/replicated storage – Spanner [OSDI'12], MDCC [Eurosys'13], Megastore [CIDR'11], COPS [SOSP'11], Eiger [NSDI'13], RedBlue [OSDI'12].
Conclusion • Chains support serializability at low latency – With static analysis of SC-cycles • Key techniques to reduce SC-cycles – Origin ordering – Commutative annotation • Chains are useful – Performing application logic – Maintaining indices/join views/geo-replicas
Limitations of Lynx/chains 1. Chains are not strictly serializable [Figure: timeline contrasting a serializable ordering with a strictly serializable ordering.] Remedies: – Programmers can wait for chain completion – Lynx provides read-your-own-writes 2. Programmers can only abort at first hop • Our application experience shows the limitations are manageable
2PC and chains: the easy way [Figure: chains T1 and T2 with hops R(A), W(A) and W(B), and a combined hop 2PC-W(AB).]
2PC and chains: the hard way [Figure: chains T1 and T2 with hops R(A), R(B), W(A) and W(B), and a combined hop 2PC-W(AB).]
2PC and chains: the hard way [Figure: a chain with hops A, B, C, D in datacenters DC1, DC2, DC3, DC4; labels: parallel unlock, 2PC retry.]
Lynx is scalable [Chart: throughput in QPS (K/s) against number of servers per DC.]
Servers per DC: 1, 2, 4, 8
Follow: 48, 93, 184, 374
Tweet: 42, 86, 173, 356
Timeline: 265, 586, 1350, 2770
Challenge of static analysis: false conflict. T1 and T2 each run: 1. Insert bid into bid history 2. Update max price on item. The conflict on bid history and the conflict on item close an SC-cycle: not serializable
Solution: commutativity annotations. T1 and T2 each run: 1. Insert bid into bid history (commutative operation: no real conflict because bid ids are unique) 2. Update max price on item (commutative operation: updating max commutes). No SC-cycle: serializable
ACID: all-or-nothing atomicity • Chain’s failure guarantee: – If the first hop of a chain commits, then all hops eventually commit • Users are only allowed to abort a chain in the first hop • Achievable with low latency: – Log chains durably at the first hop • Logs replicated to a nearby datacenter – Re-execute stalled chains upon failure recovery
ACID: serializability • Serializability – The execution result appears as if all transactions ran in some serial order – No restrictions on the serial order [Figure: a set of transactions with two valid orderings, Ordering 1 and Ordering 2.]
Problem #2: unsafe interleaving • Serializability – The execution result appears as if all transactions ran in some serial order – No restrictions on the serial order [Figure: a set of transactions with two valid orderings, Ordering 1 and Ordering 2.]
Chains are not linearizable • Serializability: a total ordering of chains • Linearizability: a total ordering of chains, and the total order obeys the issue order [Figure: transactions over time; Ordering 1 is linearizable, Ordering 2 is not.]
Transaction chains: recap • Chains provide all-or-nothing atomicity • Chains ensure serializability via static analysis • Practical challenges: – How to use chains? – How to avoid SC-cycles?
Example user chain [Figure: Alice's bid row (Bidder Alice, Item Camera, Price 100) and Bob's item row (Seller Bob, Item Camera, Highest 100).] 1. Insert bid into Alice's bid history 2. Update max price on Bob's camera
Lynx implementation • 5000 lines of C++ and a 3500-line RPC library • Uses an in-memory key/value store • Supports user chains in JavaScript (via V8)
Geo-distributed storage is hard • Applications demand simplicity & performance – Friendly programming model • Relational tables • Transactions – Fast response • Ideally, operation latency = O(intra-datacenter RTT) • Geo-distribution leads to high latency – Coordinate data access across datacenters • Operation latency = O(inter-datacenter RTT) = O(100 ms)