Fault-Tolerant Consensus

Communication Model
  • Complete graph
  • Synchronous network

Broadcast
Send a message to all processors in one round. At the end of the round, everybody receives the message.
[Figure: one processor broadcasts a; at the end of the round every processor holds a.]

Two or more processes can broadcast in the same round.
[Figure: two processors broadcast a and b; at the end of the round every processor holds a, b.]

Crash Failures
When a processor is faulty, some of its messages are lost; they are never received.
After the failure the process disappears from the network.
[Figure: a faulty processor whose message a reaches only some processors; a timeline of rounds 1 to 5 with the failure marked.]

Consensus
Start: everybody has an initial value (in the figure: 0, 1, 2, 3, 4).
Finish: everybody must decide the same value (in the figure: 3).

Validity condition: if everybody starts with the same value, they must decide that value (in the figure: all start with 1 and all finish with 1).

A simple algorithm
Each processor:
  1. Broadcast value to all processors
  2. Decide on the minimum
(only one round is needed)

Example: the processors start with 0, 1, 2, 3, 4. After the broadcast, every processor holds 0, 1, 2, 3, 4, so every processor decides on the minimum, 0.

This algorithm satisfies the validity condition: if everybody starts with the same initial value, everybody decides on that value (the minimum).
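The one-round algorithm can be sketched as a small simulation. This is only an illustration under the failure-free model described so far; the function name `simple_consensus` is hypothetical and not part of the slides.

```python
def simple_consensus(initial_values):
    """Simulate one synchronous round without failures.

    Every processor broadcasts its value, so at the end of the round
    every processor holds the same set of values and decides its minimum.
    """
    n = len(initial_values)
    # What each processor has received at the end of the round.
    received = [set(initial_values) for _ in range(n)]
    return [min(values) for values in received]


print(simple_consensus([0, 1, 4, 2, 3]))  # [0, 0, 0, 0, 0]
print(simple_consensus([1, 1, 1, 1, 1]))  # [1, 1, 1, 1, 1] (validity)
```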

Consensus with Crash Failures
The simple algorithm doesn’t work:
Each processor:
  1. Broadcast value to all processors
  2. Decide on the minimum

Example: the processors start with 0, 1, 2, 3, 4, and the processor with value 0 fails, so it doesn’t broadcast its value to all processors. Some processors receive 0, 1, 2, 3, 4 and decide 0; the others receive only 1, 2, 3, 4 and decide 1. No consensus!
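The failure can be reproduced with a short sketch in which one processor crashes in the middle of its broadcast. The function name and the `reached` parameter (the processors that still receive the crashing processor's message) are assumptions made for this illustration.

```python
def one_round_with_crash(initial_values, crashed, reached):
    """One broadcast round in which processor `crashed` fails mid-broadcast.

    Only the processors listed in `reached` receive its value.
    Returns the decision (minimum received value) of each surviving processor.
    """
    n = len(initial_values)
    decisions = {}
    for p in range(n):
        if p == crashed:
            continue  # the crashed processor never decides
        # Values from all correct processors (including p itself).
        received = {initial_values[q] for q in range(n) if q != crashed}
        if p in reached:
            received.add(initial_values[crashed])
        decisions[p] = min(received)
    return decisions


# Processor 0 (value 0) crashes after reaching only processor 1.
print(one_round_with_crash([0, 1, 4, 2, 3], crashed=0, reached={1}))
# {1: 0, 2: 1, 3: 1, 4: 1}  -> the survivors disagree
```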

If an algorithm solves consensus for f failed processes, we say it is an f-resilient consensus algorithm.

Example: the input and output of a 3-resilient consensus algorithm.
[Figure: the processes start with 0, 1, 2, 3, 4; the surviving processes finish with 1.]

An f-resilient algorithm
Round 1: Broadcast my value
Round 2 to round f+1: Broadcast any new received values
End of round f+1: Decide on the minimum value received
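The rounds above can be sketched as a simulation. The crash schedule format (`crashes` maps a round number to the crashing processor and the set of processors its last broadcast still reaches) is an assumption of this sketch, and it allows at most one crash per round.

```python
def flood_consensus(initial_values, f, crashes):
    """Simulate the f+1 round algorithm under crash failures.

    crashes: dict {round: (processor, receivers_reached_before_crash)}.
    Returns the decision of every processor that is still alive.
    """
    n = len(initial_values)
    known = [{v} for v in initial_values]  # all values seen so far
    new = [{v} for v in initial_values]    # values to broadcast next round
    alive = set(range(n))
    for rnd in range(1, f + 2):            # rounds 1 .. f+1
        inbox = [set() for _ in range(n)]
        crashing = crashes.get(rnd)
        for p in sorted(alive):
            if crashing and crashing[0] == p:
                receivers = crashing[1]    # partial broadcast, then crash
            else:
                receivers = alive
            for q in receivers:
                inbox[q] |= new[p]
        if crashing:
            alive.discard(crashing[0])
        for p in alive:
            new[p] = inbox[p] - known[p]   # only newly learned values
            known[p] |= new[p]
    return {p: min(known[p]) for p in sorted(alive)}


values = [0, 1, 4, 2, 3]
# f = 1: processor 0 crashes in round 1 after reaching only processor 1.
print(flood_consensus(values, 1, {1: (0, {1})}))
# f = 2: processor 1 additionally crashes in round 2 after reaching only processor 2.
print(flood_consensus(values, 2, {1: (0, {1}), 2: (1, {2})}))
```

Running the second schedule with only two rounds (`f=1`) leaves the survivors in disagreement, which matches the need for f+1 rounds.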

Example: f=1 failures, f+1 = 2 rounds needed
Start: the processors hold 0, 1, 2, 3, 4.
Round 1: broadcast all values to everybody. The processor with value 0 fails, and 0 reaches only some of the processors. Those processors now hold 0, 1, 2, 3, 4 (0 is a new value); the others hold 1, 2, 3, 4.
Round 2: broadcast all new values to everybody. Every processor now holds 0, 1, 2, 3, 4.
Finish: decide on the minimum value. Everybody decides 0.

Example: f=2 failures, f+1 = 3 rounds needed
Start: the processors hold 0, 1, 2, 3, 4.
Round 1: broadcast all values to everybody. The processor with value 0 fails (failure 1), and 0 reaches only some of the processors.
Round 2: broadcast new values to everybody. A second processor fails (failure 2) before its values reach everybody, so some processors still hold only 1, 2, 3, 4.
Round 3: broadcast new values to everybody. Every remaining processor now holds 0, 1, 2, 3, 4.
Finish: decide on the minimum value. Everybody decides 0.

Another example execution: f=2 failures, f+1 = 3 rounds needed
Start: the processors hold 0, 1, 2, 3, 4.
Round 1: broadcast all values to everybody. The processor with value 0 fails (failure 1), and 0 reaches only some of the processors.
Round 2: broadcast new values to everybody. Remark: at the end of this round all processes know about all the other values.
Round 3: broadcast new values to everybody. A second processor fails (failure 2), but no new values are learned in this round.
Finish: decide on the minimum value. Everybody decides 0.

If there are f failures and f+1 rounds, then there is a round with no failed process.
Example: 5 failures, 6 rounds. At least one of the rounds 1 to 6 has no failure.
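The claim is a pigeonhole argument. As a short worked step, with c_r (a symbol introduced here, not in the slides) denoting the number of processes that fail in round r:

```latex
\[
  \sum_{r=1}^{f+1} c_r \le f .
\]
If every round had a failure, i.e. $c_r \ge 1$ for all $r$, then
\[
  \sum_{r=1}^{f+1} c_r \ge f+1 > f ,
\]
a contradiction. Hence $c_r = 0$ for at least one round $r \in \{1,\dots,f+1\}$.
```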

In the algorithm, at the end of the round with no failure:
  • Every (non-faulty) process knows about all the values of all other participating processes
  • This knowledge doesn’t change until the end of the algorithm

Therefore, at the end of the round with no failure, everybody would decide the same value. However, we don’t know the exact position of this round, so we have to let the algorithm execute for f+1 rounds.

Validity of algorithm: when all processes start with the same input value, the consensus is that value. This holds since the value decided by each process is some input value.

A Lower Bound
Theorem: Any f-resilient consensus algorithm requires at least f+1 rounds.

Proof sketch: Assume for contradiction that f or fewer rounds are enough.
Worst case scenario: there is a process that fails in each round.
  • Round 1: before the process fails, it sends its value a to only one process.
  • Round 2: before that process fails, it sends value a to only one process.
  • The same happens in every round up to round f.
At the end of round f, only one process knows about value a. That process may decide a, and all other processes may decide another value (b).
Therefore f rounds are not enough: at least f+1 rounds are needed.

Byzantine Failures
A faulty processor may send different values: different processes receive different values (in the figure: a, b, c).
Some messages may be lost: a Byzantine process can behave like a crash-failed process.
After a failure the process continues functioning in the network.
[Figure: timeline of rounds 1 to 6 with failures marked.]

Consensus with Byzantine Failures
f-resilient consensus algorithm: solves consensus for f failed processes.

Example: the input and output of a 1-resilient consensus algorithm.
[Figure: the processes start with 0, 1, 2, 3, 4; the non-faulty processes finish with 3.]

Validity condition: if all non-faulty processes start with the same value, then all non-faulty processes decide that value (in the figure: start 1, finish 1).

Lower bound on number of rounds
Theorem: Any f-resilient consensus algorithm with Byzantine failures requires at least f+1 rounds.
Proof: follows from the crash failure lower bound.

A Consensus Algorithm
The King algorithm solves consensus with n processes and f failures, where f < n/4.

The King algorithm
There are f+1 phases. Each phase has two broadcast rounds. In each phase there is a different king.

Example: 12 processes, 2 faults, 3 kings.
[Figure: the 12 processes with their initial values; two are faulty, and three are marked King 1, King 2, King 3.]
Remark: There is a king that is not faulty.

Each processor p_i has a preferred value v_i. In the beginning, the preferred value is set to the initial value.

Phase k, Round 1, processor p_i:
  • Broadcast preferred value v_i
  • Let a be the majority of received values (including v_i); in case of a tie pick an arbitrary value
  • Set v_i = a

Phase k, Round 2, king p_k:
  Broadcast new preferred value v_k

Phase k, Round 2, process p_i:
  If v_i had a majority of less than n/2 + f + 1, then set v_i = v_k

End of Phase f+1:
  Each process decides on its preferred value
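A minimal simulation sketch of the King algorithm follows. The formulas on these slides are partly lost, so the sketch assumes the usual statement of the algorithm: f < n/4, f+1 phases, and a process keeps its majority value only if that value received more than n/2 + f votes. The choice of process number `phase` as the king of each phase and the `byzantine_value` callback (what a faulty process sends to a given receiver) are illustrative assumptions.

```python
from collections import Counter


def king_algorithm(initial, faulty, byzantine_value):
    """Simulate the King algorithm; returns the decisions of non-faulty processes.

    initial: list of initial values, one per process.
    faulty: set of indices of Byzantine processes.
    byzantine_value(sender, receiver, phase, rnd): value a faulty sender
        delivers to `receiver` (hypothetical adversary model).
    """
    n, f = len(initial), len(faulty)
    assert 4 * f < n, "assumed resilience bound f < n/4"
    pref = list(initial)
    for phase in range(f + 1):
        king = phase  # assumption: process `phase` is the king of this phase
        # Round 1: everybody broadcasts and takes the majority of what it received.
        majority, count = [None] * n, [0] * n
        for i in range(n):
            if i in faulty:
                continue
            received = [byzantine_value(j, i, phase, 1) if j in faulty else pref[j]
                        for j in range(n)]
            majority[i], count[i] = Counter(received).most_common(1)[0]
        for i in range(n):
            if i not in faulty:
                pref[i] = majority[i]
        # Round 2: the king broadcasts; weak majorities adopt the king's value.
        for i in range(n):
            if i in faulty:
                continue
            king_value = (byzantine_value(king, i, phase, 2)
                          if king in faulty else pref[king])
            if count[i] <= n / 2 + f:  # no strong majority
                pref[i] = king_value
    return {i: pref[i] for i in range(n) if i not in faulty}


# Process 0 is faulty (and is the first king); it tells receiver r the value r % 2.
lie = lambda sender, receiver, phase, rnd: receiver % 2
print(king_algorithm([0, 1, 0, 2, 1, 1], {0}, lie))
```

With one faulty process out of six, the second king is non-faulty, so all non-faulty processes end with the same preferred value.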

Example: 6 processes, 1 fault
[Figure: six processes, two of them marked king 1 and king 2; king 1 is faulty.]

Phase 1, Round 1: everybody broadcasts. Each processor receives either 2, 1, 1, 0, 0, 0 or 2, 1, 1, 1, 0, 0 and chooses the majority. No majority vote is a strong majority, so in round 2 everybody will choose the king’s value.
Phase 1, Round 2: the king (king 1, faulty) broadcasts. Everybody chooses the king’s value, so the processors may still hold different values.
Phase 2, Round 1: everybody broadcasts and chooses the majority. Again no majority vote is a strong majority, so in round 2 everybody will choose the king’s value.
Phase 2, Round 2: the king (king 2) broadcasts. Everybody chooses the king’s value. Final decision.

Theorem: In the phase where the king is non-faulty, every non-faulty processor decides the same value.
Proof: Consider phase k, whose king is non-faulty.

At the end of round 1, we examine two cases:
  Case 1: some node has chosen its preferred value with strong majority (more than n/2 + f votes)
  Case 2: no node has chosen its preferred value with strong majority

Case 1: suppose node i has chosen its preferred value a with strong majority (more than n/2 + f votes).
At the end of round 1, every other node must have preferred value a (including the king).
Explanation: at most f of those votes come from faulty nodes, so more than n/2 non-faulty nodes must have broadcast a at the start of round 1.
At the end of round 2:
  • If a node keeps its own value, then it decides a.
  • If a node gets the value of the king, then it decides a, since the king has decided a.
Therefore: every non-faulty node decides a.

Case 2: no node has chosen its preferred value with strong majority (more than n/2 + f votes).
Every non-faulty node will adopt the value of the king, thus all decide on the same value.
END of PROOF

Let a be the value decided at the end of phase k.
After phase k, value a will always be preferred with strong majority, since the number of non-faulty processors is n - f > n/2 + f (since f < n/4).
Thus, from phase k until the end of phase f+1, every non-faulty processor decides a.
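The inequality behind this persistence step can be checked directly. The derivation below assumes the usual resilience bound of the King algorithm, f < n/4, and a strong-majority threshold of n/2 + f:

```latex
\[
  f < \frac{n}{4}
  \;\Longrightarrow\;
  n - f > n - \frac{n}{4} = \frac{3n}{4}
  \quad\text{and}\quad
  \frac{n}{2} + f < \frac{n}{2} + \frac{n}{4} = \frac{3n}{4},
\]
\[
  \text{hence}\quad n - f > \frac{3n}{4} > \frac{n}{2} + f .
\]
```

So once all n - f non-faulty processors prefer the same value, each of them sees that value with a strong majority in every later phase, whatever the faulty processors send.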

An Impossibility Result
Theorem: There is no f-resilient algorithm for n processes, where f ≥ n/3.
Proof: First we prove the 3 process case, and then the general case.

The 3 processes case
Lemma: There is no 1-resilient algorithm for 3 processes.
Proof: Assume for contradiction that there is a 1-resilient algorithm for 3 processes.

[Figure sequence: each of the processes A, B, C runs its local algorithm with an initial value shown in parentheses, e.g. B(1), A(0), C(0), and ends with a decision value.]

Assume six processes A(0), B(0), C(0), A(1), B(1), C(1) are in a ring. The processes think they are in a triangle.
  • Two neighbouring processes that both start with 1 see exactly what they would see in a triangle whose third process is faulty, so both decide 1 (validity condition).
  • Likewise, two neighbouring processes that both start with 0 decide 0 (validity condition).
  • Some pair of neighbours then consists of one process that decides 0 and one that decides 1, while to them the rest of the ring looks like a single faulty process.
Impossible!!! since the algorithm is 1-resilient.

Therefore: There is no algorithm that solves consensus for 3 processes in which 1 is a Byzantine process.

The n processes case
Assume for contradiction that there is an f-resilient algorithm A for n processes, where f ≥ n/3.
We will use algorithm A to solve consensus for 3 processes and 1 failure (contradiction).
[Figure: algorithm A starts with n processes holding values such as 0, 1, 2; some of them fail; the non-faulty processes all finish with the same value.]

Each of the 3 processes simulates n/3 of the processes of algorithm A.
When one of the 3 processes fails, then n/3 of the processes of algorithm A fail too.
Finish of algorithm A: algorithm A tolerates n/3 failures, so all non-failed processes of A decide the same value k.
Final decision: the non-faulty processes among the 3 decide k. We reached consensus with 1 failure. Impossible!!!

Therefore: There is no f-resilient algorithm for n processes, where f ≥ n/3.

Randomized Byzantine Agreement
There is a trustworthy processor which at every round throws a random coin and informs every other processor. The coin comes up heads or tails, each with probability 1/2.

Each processor has a preferred value. In the beginning, the preferred value is set to the initial value. Assume that the initial value is binary.

The algorithm tolerates f Byzantine processors and uses threshold values: one selected when the coin is heads, one selected when the coin is tails, and one for reaching a decision.

In each round, processor p_i executes:
  Broadcast its preferred value;
  Receive values from all processors;
  Compute the majority value among the received values;
  Count the occurrences of the majority value;
  If coin = heads then use the first threshold, else use the second threshold;
  If the count reaches the selected threshold then prefer the majority value, else prefer 0;
  If the count reaches the decision threshold then a decision is reached

Analysis: Examine two cases in a round
  Termination: there is a processor whose count of its majority value reaches the decision threshold
  Other cases:
    Case 1: two processors have different majority values
    Case 2: all processors have the same majority value

Termination: there is a processor whose count reaches the decision threshold.
Since the faulty processors are at most f, all but at most f of that processor’s votes for the majority value were received from good processors.
Therefore, every processor will have the same majority value, with a count that reaches the selected threshold. Consequently, at the end of the round all the good processors will have the same preferred value: the majority value.

Observation: if in the beginning of a round all the good processors have the same preferred value, then the algorithm terminates in that round. This holds since for every processor the termination condition will be true in that round.

Therefore, if the termination condition is true for one processor at a round, then the termination condition will be true for all processors at the next round.

Case 1: two processors have different majority values.
It has to be that neither count reaches the selected threshold, and therefore each of them prefers 0. Thus, every processor chooses 0, and the algorithm terminates in the next round.
Suppose (for the sake of contradiction) that one of the counts reaches the threshold. Then enough good processors have voted for that value to make it the majority value at every processor. Contradiction!

Case 2: all processors have the same majority value.
Then for any two processors the counts differ by at most f in either direction, since otherwise the number of faulty processors would exceed f.
[A reference processor is singled out here; its defining formula is lost.]
  Sub-case 1 [condition and probability lost]: for any processor the count stays below the selected threshold. Thus, every processor chooses 0, and the algorithm terminates in the next round.
  Sub-case 2 [condition and probability lost]: for any processor the count reaches the selected threshold. Thus, every processor chooses the majority value, and the algorithm terminates in the next round.