Chapter 18: Replication
Distributed Systems: Concepts and Design
Dr. Ir. H. Sumijan, M.Sc.
Coulouris G. et al., 2012: Distributed Systems: Concepts and Design (5th Edition), © Addison-Wesley 2012
Figure 18.1: A basic architectural model for the management of replicated data. [Diagram: clients (C) exchange requests and replies with front ends (FE), which communicate with the replica managers (RM) that make up the service.]
Figure 18.2: View-synchronous group communication. [Diagram: four scenarios for processes p, q and r, in which p crashes and the view changes from (p, q, r) to (q, r). Cases a and b are allowed; cases c and d are disallowed.]
Figure 18.3: The passive (primary-backup) model for fault tolerance. [Diagram: clients (C), front ends (FE), a primary replica manager (RM) and backup replica managers.]
Figure 18.4: Active replication. [Diagram: clients (C), front ends (FE) and replica managers (RM).]
Figure 18.5: Query and update operations in a gossip service. [Diagram: clients issue Query and Update operations through front ends (FE). A front end sends (Query, prev) to a replica manager (RM) and receives (Val, new); it sends (Update, prev) and receives an Update id. The replica managers in the service exchange gossip.]
Figure 18.6: Front ends propagate their timestamps whenever clients communicate directly. [Diagram: clients, front ends (FE) exchanging vector timestamps, and replica managers (RM) exchanging gossip within the service.]
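One common way to combine two vector timestamps is a component-wise maximum. The sketch below illustrates that merge for a front end receiving another front end's timestamp; the function name `merge` and the three-entry vectors are assumptions made for illustration, not taken from the slide.

```python
def merge(own, received):
    """Component-wise maximum of two vector timestamps of equal length."""
    if len(own) != len(received):
        raise ValueError("vector timestamps must have one entry per replica manager")
    return [max(a, b) for a, b in zip(own, received)]


# Hypothetical timestamps held by two front ends (one entry per replica manager).
fe1 = [2, 0, 1]
fe2 = [1, 3, 1]

# After fe1 receives a message carrying fe2's timestamp:
fe1 = merge(fe1, fe2)
print(fe1)  # [2, 3, 1]
```

After the merge, the receiving front end's timestamp reflects every update that either front end had already observed.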
Figure 18.7: A gossip replica manager, showing its main state components. [Diagram: a replica manager holding a value and value timestamp, an update log, a replica timestamp, an executed operation table and a timestamp table. It receives updates (OperationID, Update, Prev) from front ends (FE), applies stable updates to the value, and exchanges gossip messages (replica timestamp and replica log) with other replica managers.]
Figure 18.8: Committed and tentative updates in Bayou. [Diagram: committed updates c0, c1, c2 up to cN, followed by tentative updates t0, t1, t2 up to ti, ti+1.] Tentative update ti becomes the next committed update and is inserted after the last committed update cN.
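The reordering described in the caption can be sketched as follows. This is a toy model under stated assumptions, not Bayou's implementation: the class `UpdateLog` and its method names are hypothetical, and updates are represented by plain strings.

```python
class UpdateLog:
    """Toy log: a committed prefix followed by tentative updates."""

    def __init__(self):
        self.committed = []  # c0, c1, ..., cN in their final order
        self.tentative = []  # t0, t1, ... in their provisional order

    def add_tentative(self, update):
        self.tentative.append(update)

    def commit(self, update):
        # The chosen tentative update leaves the tentative sequence and is
        # inserted immediately after the last committed update cN.
        self.tentative.remove(update)
        self.committed.append(update)

    def order(self):
        # Committed updates always precede tentative ones.
        return self.committed + self.tentative


log = UpdateLog()
for u in ["t0", "t1", "t2"]:
    log.add_tentative(u)

log.commit("t1")
print(log.order())  # ['t1', 't0', 't2']
```

Committing t1 ahead of t0 changes the overall order, which is why tentative updates may need to be undone and reapplied once the committed order is known.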
Figure 18.9: Transactions on replicated data. [Diagram: client + front end T issues getBalance(A); client + front end U issues deposit(B, 3); the replica managers hold copies of A and B.]
Figure 18.10: Available copies. [Diagram: client + front end T performs getBalance(B) and deposit(A, 3); client + front end U performs getBalance(A) and deposit(B, 3). Replica managers X and Y hold copies of A; replica managers M, N and P hold copies of B.]
Figure 18.11: Network partition. [Diagram: client + front end T issues withdraw(B, 4) on one side of a network partition; client + front end U issues deposit(B, 3) on the other. The replica managers on both sides hold copies of B.]
Page 810: Gifford’s quorum consensus examples

                                      Example 1   Example 2   Example 3
Latency (milliseconds)   Replica 1    75          75          75
                         Replica 2    65          100         750
                         Replica 3    65          750         750
Voting configuration     Replica 1    1           2           1
                         Replica 2    0           1           1
                         Replica 3    0           1           1
Quorum sizes             R            1           2           1
                         W            1           3           3

Derived performance of file suite:

Read    Latency                       65          75          75
        Blocking probability          0.01        0.0002      0.000001
Write   Latency                       75          100         750
        Blocking probability          0.01        0.0101      0.03
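The blocking probabilities in the table can be checked with a short calculation. The sketch below assumes that each replica is unavailable independently with probability 0.01 (an assumption consistent with the figures shown, not stated on the slide) and that an operation blocks when the available replicas hold fewer votes than its quorum. It also checks the usual quorum constraints, R + W greater than the total votes and W greater than half the total. The function names are hypothetical.

```python
from itertools import product


def valid_quorums(votes, r, w):
    """Quorum constraints: R + W > total votes and W > half the total."""
    total = sum(votes)
    return r + w > total and 2 * w > total


def blocking_probability(votes, quorum, p_down=0.01):
    """Probability that the available replicas hold fewer than `quorum` votes.

    Assumes each replica is unavailable independently with probability p_down.
    """
    blocked = 0.0
    for up in product([True, False], repeat=len(votes)):
        available = sum(v for v, is_up in zip(votes, up) if is_up)
        if available < quorum:
            prob = 1.0
            for is_up in up:
                prob *= (1 - p_down) if is_up else p_down
            blocked += prob
    return blocked


# (votes per replica, R, W) for the three examples in the table.
examples = {
    "Example 1": ([1, 0, 0], 1, 1),
    "Example 2": ([2, 1, 1], 2, 3),
    "Example 3": ([1, 1, 1], 1, 3),
}

for name, (votes, r, w) in examples.items():
    print(name,
          valid_quorums(votes, r, w),
          f"read={blocking_probability(votes, r):.6f}",
          f"write={blocking_probability(votes, w):.6f}")
```

Under these assumptions the computed values round to the table entries, for instance 0.000199 for reads and 0.010099 for writes in Example 2.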
Figure 18.12: Two network partitions. [Diagram: transaction T and replica managers V, X, Y and Z, separated by a network partition.]
Figure 18.13: Virtual partition. [Diagram: replica managers V, X, Y and Z, a network partition, and a virtual partition.]
Figure 18.14: Two overlapping virtual partitions. [Diagram: virtual partitions V1 and V2 over replica managers V, X, Y and Z.]
Figure 18.15: Creating a virtual partition

Phase 1:
• The initiator sends a Join request to each potential member. The argument of Join is a proposed logical timestamp for the new virtual partition.
• When a replica manager receives a Join request, it compares the proposed logical timestamp with that of its current virtual partition.
  – If the proposed logical timestamp is greater, it agrees to join and replies Yes.
  – If it is less, it refuses to join and replies No.

Phase 2:
• If the initiator has received sufficient Yes replies to have read and write quora, it may complete the creation of the new virtual partition by sending a Confirmation message to the sites that agreed to join. The creation timestamp and list of actual members are sent as arguments.
• Replica managers receiving the Confirmation message join the new virtual partition and record its creation timestamp and list of actual members.
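The two phases above can be sketched in a few lines. This is a simplified, single-process illustration under stated assumptions: every replica manager carries one vote, so the quorum check reduces to counting Yes replies; a proposed timestamp equal to the current one is treated as a refusal (the slide does not cover that case); and all class, method and parameter names are hypothetical.

```python
class ReplicaManager:
    def __init__(self, name, partition_timestamp):
        self.name = name
        self.partition_timestamp = partition_timestamp  # current virtual partition
        self.members = []

    def on_join(self, proposed_timestamp):
        # Phase 1: reply Yes only if the proposal is newer than the current
        # virtual partition. Equal timestamps are refused (an assumption).
        return proposed_timestamp > self.partition_timestamp

    def on_confirmation(self, creation_timestamp, members):
        # Phase 2: join the new partition and record its details.
        self.partition_timestamp = creation_timestamp
        self.members = list(members)


def create_virtual_partition(potential_members, proposed_timestamp,
                             read_quorum, write_quorum):
    # Phase 1: send Join to every potential member and collect Yes replies.
    agreed = [rm for rm in potential_members if rm.on_join(proposed_timestamp)]

    # Phase 2: proceed only if the Yes replies cover both quora
    # (one vote per replica manager is assumed here).
    if len(agreed) < max(read_quorum, write_quorum):
        return None
    names = [rm.name for rm in agreed]
    for rm in agreed:
        rm.on_confirmation(proposed_timestamp, names)
    return names


x, v, y = (ReplicaManager(n, 1) for n in "XVY")
z = ReplicaManager("Z", 7)  # already in a newer virtual partition
print(create_virtual_partition([x, v, y, z], 5, read_quorum=2, write_quorum=3))
# ['X', 'V', 'Y']
```

In this run Z refuses because its current partition has a greater timestamp, and the remaining three Yes replies are enough for both quora, so X, V and Y record the new partition.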