Consistency and Replication CSCI 4780/6780

Chapter Outline
• Why replication?
  – Relations to reliability and scalability
• How to maintain consistency of replicated data?
  – Consistency models
  – Consistency schemes
  – How to distribute updates and when to distribute them
• Examples
  – Parallel programming
  – WWW-based systems

Reasons for Replication
• Two primary reasons
  – Improving reliability of system
  – Improving scalability and performance of system
• Reliability
  – Resilience to failures
  – Protection against data corruption: Byzantine failures and quorum-based systems

Scalability and Performance
• Scaling in numbers
  – Replication can help to scale the distributed system in numbers
  – If the number of processes accessing the data increases, it helps to replicate the data
  – Example: Parallel programs
• Geographical scaling
  – Placing a replica close to the process using the data improves performance
  – Example: Edge cache networks, browser caches, etc.

Problems of Replication
• Creating and maintaining replicas is not free
• Multiple copies lead to consistency problems
  – What happens when one of the replicas gets modified?
  – Modifications have to be carried out at all replicas
  – How and when they are carried out determines the cost of replication
• WWW-based systems
  – Browser and client-side caches
  – May lead to stale pages
  – TTL model, Update/Invalidate model
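The TTL and invalidate models named on this slide can be sketched as a small client-side cache. This is a minimal illustration, not how any particular browser works: the class `TTLCache`, its `fetch` callback, and the injectable `clock` are all hypothetical names introduced here.

```python
import time


class TTLCache:
    """Hypothetical client-side cache: entries expire after `ttl` seconds."""

    def __init__(self, fetch, ttl, clock=time.monotonic):
        self.fetch = fetch    # assumed callback that reads a page from the origin
        self.ttl = ttl        # lifetime of a cached copy, in seconds
        self.clock = clock    # injectable clock, so the behavior can be tested
        self.entries = {}     # url -> (value, time the copy was fetched)

    def get(self, url):
        entry = self.entries.get(url)
        if entry is not None and self.clock() - entry[1] < self.ttl:
            # TTL model: serve the copy without asking the origin.
            # The copy may be stale if the origin changed in the meantime.
            return entry[0]
        value = self.fetch(url)
        self.entries[url] = (value, self.clock())
        return value

    def invalidate(self, url):
        """Invalidate model: the origin tells the cache to drop its copy."""
        self.entries.pop(url, None)
```

With a pure TTL, a page changed at the origin keeps being served from the cache until the entry expires. An explicit `invalidate` call removes the stale copy at once, at the price of the origin having to track and notify its caches.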

Replication as Scalability Technique
• Replication can help to solve geographical scalability problems
  – Placing replicas closer to clients
• Keeping replicas consistent may impose severe overheads
  – Example: N accesses and M updates per unit time with N << M
• Problems with multiple copies and tight consistency
  – Implementing global synchronization
• Relaxing consistency requirements is a possible solution
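The N << M example can be made concrete with a little arithmetic. The sketch below assumes one message per pushed update and one message per read that fetches the current value on demand. The function name `replica_traffic` and its result keys are made up for illustration.

```python
def replica_traffic(n_reads, m_updates):
    """Rough per-unit-time message counts for a single replica (illustrative)."""
    return {
        # Every update is propagated to the replica as soon as it happens.
        "push_every_update": m_updates,
        # Alternative: the replica fetches the current value only when read.
        "fetch_on_read": n_reads,
        # Each read observes at most one version, so at least this many
        # pushed updates are overwritten before anyone reads them.
        "never_read_at_least": max(0, m_updates - n_reads),
    }


# Example: 10 reads and 1000 updates per second at one replica.
print(replica_traffic(10, 1000))
```

When N << M, nearly all of the propagation work goes into versions that no process ever reads, which is why keeping such a replica tightly consistent is so costly.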

Data-Centric Consistency Models The general organization of a logical data store, physically distributed and replicated across multiple processes.

Consistency Models
• Contract between processes and data store
• If the processes play by certain rules, the store promises to work correctly
  – Data store guarantees certain properties on the data items stored
  – Example: A read on a data item always returns the value of the most recent write
• Several data consistency models
  – Strict consistency, sequential consistency, causal consistency, FIFO consistency

Strict Consistency
• Most stringent consistency model
  “Any read on a data item x returns a value corresponding to the result of the most recent write on x”
• Natural and obvious
• Uniprocessor systems guarantee strict consistency
• Implicitly assumes the existence of absolute global time

Strict Consistency - Illustration
Behavior of two processes, operating on the same data item.
• A strictly consistent store.
• A store that is not strictly consistent.
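The definition of strict consistency can be checked mechanically if we pretend that every operation carries an absolute global timestamp. The sketch below does that for a single data item x. The function `is_strictly_consistent` and the two histories are hypothetical examples in the spirit of the two cases above, not a transcription of the original figure.

```python
def is_strictly_consistent(history, initial=None):
    """Check a history of operations on one data item x.

    `history` is a list of (time, kind, value) tuples, where kind is "W"
    for a write or "R" for a read, and `time` is an assumed absolute
    global timestamp, unique per operation.
    """
    latest = initial
    for _, kind, value in sorted(history, key=lambda op: op[0]):
        if kind == "W":
            latest = value       # this is now the most recent write on x
        elif value != latest:
            return False         # a read returned something other than it
    return True


# P1 writes a at time 1; P2 reads a at time 2.
consistent = [(1, "W", "a"), (2, "R", "a")]

# P1 writes a at time 1; P2 still reads the initial value at time 2,
# and only sees a at time 3.
not_consistent = [(1, "W", "a"), (2, "R", None), (3, "R", "a")]

print(is_strictly_consistent(consistent))      # True
print(is_strictly_consistent(not_consistent))  # False
```

The checker only works because the timestamps are taken as given. A real distributed system has no such global clock, and that assumption is exactly the weak point of strict consistency.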

Problems with Strict Consistency
• Strict consistency poses serious problems for systems with multiple machines
• Example
  – Two machines A and B are located on different continents, and data item x is stored on B
  – A issues a read at T1, and immediately afterwards B performs a write at T2
  – If T2 - T1 is very small, the write is likely to complete before the read request issued at T1 arrives at B, so the read returns the new value; otherwise, the read returns the old value
• Problem arises because strict consistency relies on absolute global time
  – Impossible to assign unique time stamps corresponding to actual global time
  – Locks do not solve the problem
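The two-machine example can be played through with concrete numbers. The sketch assumes the write on B takes effect instantly at T2 and that A's read request needs a fixed one-way `latency` to reach B. The function `read_outcome` and the numbers used are illustrative assumptions.

```python
def read_outcome(t_read_issued, t_write, latency):
    """Return (value the read actually gets, value strict consistency demands).

    Times are in seconds on an imagined global clock. The read is issued
    on A at `t_read_issued`; the write happens locally on B at `t_write`.
    """
    arrives_at_b = t_read_issued + latency
    # B serves the read when the request arrives, using whatever x holds then.
    returned = "new" if t_write <= arrives_at_b else "old"
    # Strict consistency compares the instants at which the operations occur.
    required = "new" if t_write <= t_read_issued else "old"
    return returned, required


# Read issued at T1 = 0, write 1 ns later, 50 ms intercontinental latency.
print(read_outcome(0.0, 1e-9, 0.05))   # ('new', 'old')
```

The read was issued before the write, yet it returns the new value because the request was still in transit when the write happened. The mismatch disappears only if the latency is smaller than T2 - T1, which a system spanning continents cannot guarantee.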