Leader Election

Leader Election: the idea
We study Leader Election in rings.

Why rings?
• historical reasons
  – original motivation: regenerate a lost token in token ring networks
• illustrates techniques and principles
• good for lower bounds and impossibility results

Outline
• Specification of Leader Election
• YAIR
• Leader election in asynchronous rings:
  – An $O(n^2)$ algorithm
  – An $O(n \log n)$ algorithm
  – The revenge of the lower bound!
• Leader election in synchronous rings
  – Breaking the $\Omega(n \log n)$ barrier

Message passing: Model
• $n$ processors $p_0, \ldots, p_{n-1}$
• connected by bi-directional communication channels
• topology represented by an undirected graph
[Figure: example graph on processors $p_0, \ldots, p_4$; some links may be missing]

Processors
Each $p_i$ is a state machine
• state set $Q_i$
• distinguished initial states
$p_i$'s state includes
• $outbuf_i[l]$: set of messages sent on the $l$-th channel and not yet delivered
• $inbuf_i[l]$: set of messages delivered on the $l$-th channel and not yet processed
• $inbuf_i$ initially empty
• $outbuf_i$ not accessible
• could be infinite

State Transitions
A state transition:
• input: accessible state of $p_i$ (doesn't depend on $outbuf_i$)
• consumes all messages in $inbuf_i$
• outputs at most one message per channel

Terminology
Definition: A configuration is a vector $C = (q_0, \ldots, q_{n-1})$
• each $q_i$ is a state of $p_i$
• the messages in the $outbuf_i$ sets are the messages in transit
In an initial configuration each $q_i$ is an initial state of $p_i$
Definition: An event is
• a computation event $comp(i)$
• a delivery event $del(i, j, m)$
Definition: An execution is an infinite sequence $C_0, \phi_0, C_1, \phi_1, \ldots$ where
• $C_0$ is an initial configuration
• each $C_i$ is a configuration
• each $\phi_i$ is an event
Definition: A schedule for the above execution is the sequence of events $\phi_0, \phi_1, \ldots$

Safety and Liveness
Safety property: “nothing bad happens”
• holds in every finite execution prefix
  – Windows™ never crashes
  – if one general attacks, both do
  – a program never terminates with a wrong answer
Liveness property: “something good eventually happens”
• no partial execution is irremediable
  – Windows™ always reboots
  – both generals eventually attack
  – a program eventually terminates
Admissible executions satisfy safety and liveness properties for a particular system type.

A really cool theorem
Every property is a combination of a safety property and a liveness property (Alpern and Schneider)

Asynchronous Message-Passing Systems
Execution: $C_0, \phi_0, C_1, \phi_1, C_2, \ldots$
If $\phi_k = del(i, j, m)$:
• in $C_{k-1}$
  – $m$ is in $outbuf_i[l]$, where $l$ is $p_i$'s label for channel $\{p_i, p_j\}$
• in $C_k$
  – $m$ is removed from $outbuf_i[l]$
  – $m$ is added to $inbuf_j[h]$, where $h$ is $p_j$'s label for channel $\{p_i, p_j\}$
If $\phi_k = comp(i)$:
• $p_i$ changes state according to its transition function
• empties the $inbuf_i$ it had in $C_{k-1}$
• might add messages to $outbuf_i$ in $C_k$
Admissible if:
• Every processor takes an infinite number of computation steps
• Every message sent is eventually delivered
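The two event types above can be sketched in a few lines of Python. This is only an illustration of the model, not code from the slides: the names `Processor`, `deliver`, and `compute`, and the convention that a transition function returns `(new_state, {channel: message})`, are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Processor:
    state: Any
    # channel label -> messages delivered but not yet processed
    inbuf: Dict[int, List[Any]] = field(default_factory=dict)
    # channel label -> messages sent but not yet delivered
    outbuf: Dict[int, List[Any]] = field(default_factory=dict)


def deliver(procs: List[Processor], i: int, l: int, j: int, h: int, m: Any) -> None:
    """del(i, j, m): move m from outbuf_i[l] to inbuf_j[h]."""
    procs[i].outbuf[l].remove(m)
    procs[j].inbuf.setdefault(h, []).append(m)


def compute(
    procs: List[Processor],
    i: int,
    transition: Callable[[Any, Dict[int, List[Any]]], Tuple[Any, Dict[int, Any]]],
) -> None:
    """comp(i): apply the transition function to p_i's accessible state."""
    p = procs[i]
    received = {l: msgs for l, msgs in p.inbuf.items() if msgs}
    p.inbuf = {}  # a computation event consumes all delivered messages
    p.state, to_send = transition(p.state, received)
    for l, m in to_send.items():  # at most one message per channel
        p.outbuf.setdefault(l, []).append(m)
```

Note that `transition` never sees `outbuf`, which mirrors the rule that the outgoing buffers are not part of the accessible state.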

Synchronous Message-Passing Systems
Execution: $C_0, \phi_0, C_1, \phi_1, C_2, \ldots$
• all asynchronous constraints, plus
• execution partitioned into disjoint rounds
• each round: one delivery event for every message in every outbuf, followed by one computation event for every processor
Remarks
• not realistic, but
• good for algorithm design
• good for lower bounds

Complexity
TIME
• each processor's state set includes terminated states
• termination:
  – all processors in terminated states
  – no messages in transit
• Synchronous: count number of rounds until termination
• Asynchronous: set unit of time as maximum message delay
SPACE
• Count maximum total number of messages

The Problem
• Final states of processes are partitioned into two classes: elected and non-elected
• Once a process enters a final state, it stays in that state
• In every admissible execution, exactly one process (the leader) enters an elected state; all remaining processes enter a non-elected state

Lots of variations...
• The ring can be unidirectional or bidirectional
• The number $n$ of processors may be known or unknown
• Processors can be identical or can be somehow distinguished
• Communication may be synchronous or asynchronous

Uni- vs. Bidirectional
In unidirectional rings, messages can only be sent in a clockwise direction

Can processors be distinguished?
If no, anonymous algorithms
• Processors have no UID
• Formally: identical automata
• Can distinguish between left and right

Can processors be distinguished?
If yes:
• processors have unique IDs
• chosen from some large totally ordered space of IDs (e.g. $\mathbb{N}^+$)
• no constraint on which IDs are used (e.g. integers may not be consecutive)
• IDs can be manipulated either only by certain operations (e.g. comparison)
• or by unrestricted operations

Is $n$ known?
If no, uniform algorithms
• Algorithm cannot use information about ring size

Communication: Asynchronous vs. Synchronous
Asynchronous:
• no upper bound on message delivery time
• no centralized clock
• no bound on relative speed of processes
Synchronous:
• communication in rounds
• In a round a process:
  – delivers all pending messages
  – takes an execution step (which may involve sending one or more messages)
If there are no failures, every message sent is eventually delivered

An Impossibility Result
Theorem: There is no deterministic solution to the leader election problem for a synchronous, non-uniform, anonymous bidirectional ring.
Proof: Suppose that a solution exists for a system $A$ of $n > 1$ processes. Each process of $A$ starts in the same state.
Lemma: The states of all processors at the end of each round of the execution of $A$ are the same.
Proof: By induction on the number of rounds $k$
• Base case: $k = 0$. Easy, since processes start in the same state.
• Inductive step: assume the lemma holds for $k = t-1$
  – processors are in identical states at the end of round $t-1$
  – they send the same messages to their left and right neighbors
  – every processor receives the same messages on its left and right channels as every other processor
  – all processors apply the same transition function to identical states in round $t$
  – all processors have identical states at the end of round $t$
Then, if one enters the leader state, all do!
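The induction can be observed experimentally. The sketch below is not from the slides: `run_anonymous_ring` and `example_step` are hypothetical names, and the transition function is an arbitrary deterministic one chosen only for illustration. It runs identical processors on a synchronous bidirectional ring and checks that every configuration is fully symmetric.

```python
from typing import Callable, List, Tuple

State = int
Msg = int
Step = Callable[[State, Msg, Msg], Tuple[State, Msg, Msg]]


def run_anonymous_ring(n: int, rounds: int, init: State, step: Step) -> List[List[State]]:
    """Run synchronous rounds on a ring of n identical processors.

    step(state, from_left, from_right) -> (new_state, to_left, to_right).
    Returns one state vector per round, starting with the initial one.
    """
    states = [init] * n
    from_left = [0] * n   # messages each processor receives next round
    from_right = [0] * n
    history = [list(states)]
    for _ in range(rounds):
        results = [step(states[i], from_left[i], from_right[i]) for i in range(n)]
        states = [r[0] for r in results]
        # What i-1 sent to its right arrives at i from the left, and
        # what i+1 sent to its left arrives at i from the right.
        from_left = [results[(i - 1) % n][2] for i in range(n)]
        from_right = [results[(i + 1) % n][1] for i in range(n)]
        history.append(list(states))
    return history


def example_step(state: State, from_left: Msg, from_right: Msg) -> Tuple[State, Msg, Msg]:
    # Arbitrary deterministic transition; it may even hard-code n (non-uniform).
    new_state = (3 * state + from_left + 2 * from_right + 1) % 101
    return new_state, new_state + 1, new_state + 2


history = run_anonymous_ring(5, 20, 0, example_step)
symmetric = all(len(set(config)) == 1 for config in history)
```

Whatever deterministic `step` is plugged in, `symmetric` stays true, so no single processor can ever be the only one in an elected state.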

Observations
• What are the implications for asynchronous rings?
• What are the implications for uniform rings?

Outline
• Specification of Leader Election
• YAIR
• Leader election in asynchronous rings:
  – An $O(n^2)$ algorithm
  – An $O(n \log n)$ algorithm
  – The revenge of the lower bound!
• Leader election in synchronous rings
  – Breaking the $\Omega(n \log n)$ barrier

The LCR Algorithm
LeLann (1977), Chang and Roberts (1979)
• unidirectional
• asynchronous
• non-anonymous: every process has a UID
• uniform (does not depend on $n$)
Code for processor $p_i$:
  upon receiving no message
    send uid_i to left (clockwise)
  upon receiving m from right
    case
      m.uid > uid_i:
        send m to left
      m.uid < uid_i:
        discard m
      m.uid = uid_i:
        leader := i
        send <terminate, i> to left
        terminate
    endcase
  upon receiving <terminate, i> from right neighbor
    leader := i
    send <terminate, i> to left
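A small simulation can make the message flow concrete. This is a sketch under stated assumptions rather than the authors' code: the function name `lcr`, the use of FIFO channels, the choice that processor `i` sends to processor `(i + 1) % n`, and the random delivery order standing in for asynchrony are all choices made for the example.

```python
import random
from collections import deque
from typing import List, Optional, Tuple


def lcr(uids: List[int], seed: int = 0) -> Tuple[List[Optional[int]], int]:
    """Simulate LCR; returns (leader index known to each processor, messages sent)."""
    n = len(uids)
    rng = random.Random(seed)
    # channel[i]: messages in transit from processor i to its left neighbour (i + 1) % n
    channel = [deque([("uid", u)]) for u in uids]  # everyone first sends its own UID
    messages = n
    leader: List[Optional[int]] = [None] * n
    while any(channel):
        # Asynchrony: deliver the head of an arbitrary non-empty channel.
        i = rng.choice([k for k in range(n) if channel[k]])
        kind, value = channel[i].popleft()
        j = (i + 1) % n  # the receiver
        if kind == "uid":
            if value > uids[j]:
                channel[j].append(("uid", value))  # forward larger UIDs
                messages += 1
            elif value == uids[j]:
                leader[j] = j  # own UID came back: elected
                channel[j].append(("terminate", j))
                messages += 1
            # value < uids[j]: discard
        elif leader[j] is None:
            leader[j] = value  # learn the leader and pass the announcement on
            channel[j].append(("terminate", value))
            messages += 1
        # otherwise the announcement has returned to the leader and stops
    return leader, messages
```

For example, `lcr([4, 3, 2, 1, 0])` places the UIDs so that every token travels toward smaller UIDs; the count it returns includes the $n$ terminate messages in addition to the UID messages.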

Correctness
• messages from the process with the highest ID are never discarded
• therefore the correct leader is elected
• no other processor ID can traverse the entire ring
• therefore no one else is elected

Complexity
Message complexity: $O(n^2)$
Time complexity: $O(n)$
This bound is tight…
[Figure: ring with UIDs $n-1, n-2, \ldots, 2, 1, 0$]
Can we do better?
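The tightness claim can be checked by a short count. The following is a sketch assuming UIDs $0, \ldots, n-1$ placed so that every token travels toward decreasing UIDs (the arrangement the figure appears to show), and ignoring the $n$ terminate messages. The token carrying UID $v$ is forwarded past the $v$ smaller UIDs before it reaches a process whose UID is at least $v$, so it accounts for $v + 1$ messages:

```latex
\sum_{v=0}^{n-1} (v + 1) \;=\; \frac{n(n+1)}{2} \;=\; \Theta(n^2)
```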

The HS algorithm
Hirschberg and Sinclair (1980)
• Ring is bidirectional
• Each process $p_i$ operates in phases
• In each phase $l$, $p_i$ sends out “tokens” containing $uid_i$ in both directions
• Tokens are intended to travel distance $2^l$ and return to $p_i$
• However, tokens may not make it back
  – A token continues outbound only if its UID is greater than the UIDs of the processes on its path
  – Otherwise it is discarded
  – All processes always forward tokens moving inbound
[Figure: token trajectories for phases 0, 1, and 2]
If $p_i$ receives its own token while it is going outbound, $p_i$ is the leader

The Protocol
Code for processor $p_i$:
  Init: asleep := true
  upon receiving no message
    if asleep then
      asleep := false
      send <uid_i, out, 1> to left and right
  upon receiving <uid_j, out, h> from left
    case
      uid_j > uid_i and h > 1:
        send <uid_j, out, h-1> to right
      uid_j > uid_i and h = 1:
        send <uid_j, in, 1> to left
      uid_j = uid_i:
        leader := i
    endcase
  upon receiving <uid_j, out, h> from right
    case
      uid_j > uid_i and h > 1:
        send <uid_j, out, h-1> to left
      uid_j > uid_i and h = 1:
        send <uid_j, in, 1> to right
      uid_j = uid_i:
        leader := i
    endcase
  upon receiving <uid_j, in, 1> from right
    send <uid_j, in, 1> to left
  upon receiving <uid_j, in, 1> from left
    send <uid_j, in, 1> to right
  upon receiving <uid_i, in, 1> from left and right
    phase := phase + 1
    send <uid_i, out, 2^phase> to left and right
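The protocol can be exercised with a small simulation. This is an illustrative sketch, not the authors' implementation: the name `hs`, the single global FIFO queue (one particular admissible delivery order), and the convention that the left neighbour of `i` is `(i + 1) % n` are assumptions made for the example.

```python
from collections import deque
from typing import List, Optional, Tuple


def hs(uids: List[int]) -> Tuple[Optional[int], int]:
    """Simulate HS on a bidirectional ring; returns (leader index, messages sent)."""
    n = len(uids)
    queue = deque()      # in transit: (receiver, arrived_from, uid, direction, hops)
    sent = 0
    phase = [0] * n
    returned = [0] * n   # own tokens that came back in the current phase
    leader = None

    def send(i: int, to: str, uid: int, direction: str, hops: int) -> None:
        nonlocal sent
        sent += 1
        # A message sent to the left arrives at the receiver from its right.
        if to == "left":
            queue.append(((i + 1) % n, "right", uid, direction, hops))
        else:
            queue.append(((i - 1) % n, "left", uid, direction, hops))

    for i in range(n):   # everyone wakes up and starts phase 0
        send(i, "left", uids[i], "out", 1)
        send(i, "right", uids[i], "out", 1)

    while queue:
        i, frm, uid, direction, hops = queue.popleft()
        onward = "left" if frm == "right" else "right"  # keep travelling the same way
        if direction == "out":
            if uid == uids[i]:
                leader = i                       # own token circled the ring
            elif uid > uids[i]:
                if hops > 1:
                    send(i, onward, uid, "out", hops - 1)
                else:
                    send(i, frm, uid, "in", 1)   # turn around
            # uid < uids[i]: swallow the token
        elif uid != uids[i]:
            send(i, onward, uid, "in", 1)        # always relay inbound tokens
        else:
            returned[i] += 1
            if returned[i] == 2:                 # both tokens are back: next phase
                returned[i] = 0
                phase[i] += 1
                for to in ("left", "right"):
                    send(i, to, uids[i], "out", 2 ** phase[i])
    return leader, sent
```

Calling `hs([3, 7, 1, 9, 4, 6, 2, 8])` elects the processor at index 3 (UID 9); comparing the returned message count against the LCR count on the same ring shows the effect of the doubling phases.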

Correctness
Same as LCR:
• messages from the process with the highest ID are never discarded
• therefore the correct leader is elected
• no other processor ID can traverse the entire ring
• therefore no one else is elected

Communication Complexity
• Every processor sends a token in phase 0: $4n$ messages
• For phase $l > 0$:
  – the only processors to send tokens are those that “won” in phase $l-1$
  – there is at most one winner for every $2^{l-1} + 1$ processors
  – processors sending tokens in phase $l > 0$: at most $\lfloor n / (2^{l-1} + 1) \rfloor$
  – tokens travel distance $2^l$
  – total number of messages sent in phase $l$ is bounded by $4 \cdot 2^l \cdot \lfloor n / (2^{l-1} + 1) \rfloor \le 8n$
• Total number of phases: at most $1 + \lceil \log n \rceil$
• Number of messages is bounded by $8n(1 + \lceil \log n \rceil)$, which is $O(n \log n)$

Time Complexity
• Time for each phase $l$: $2 \cdot 2^l = 2^{l+1}$
• Final phase takes $n$ (tokens only travel outbound)
• Next-to-last phase is $l = \lceil \log n \rceil - 1$
• Total time complexity excluding the last phase: at most $2 \cdot 2^{\lceil \log n \rceil}$
Time complexity is at most $3n$ (if $n$ is a power of 2) to $5n$ (otherwise)

The revenge of the lower bound
So far we have seen:
• a simple $O(n^2)$ algorithm
• a more clever $O(n \log n)$ algorithm
• focus on message complexity
Facts:
• $\Omega(n \log n)$ lower bound in asynchronous networks
• $\Omega(n \log n)$ lower bound in synchronous networks when using only comparisons

Outline
• Specification of Leader Election
• YAIR
• Leader election in asynchronous rings:
  – An $O(n^2)$ algorithm
  – An $O(n \log n)$ algorithm
  – The revenge of the lower bound!
• Leader election in synchronous rings
  – Breaking the $\Omega(n \log n)$ barrier
• The rise and fall of randomization

Leader Election with fewer than $O(n \log n)$ messages
• Synchronous rings
• UIDs are positive integers
• Can be manipulated using arbitrary arithmetic operations
TimeSlice
• $n$ is known to all processors
VariableSpeeds
• $n$ is not known to all processors
Both
• unidirectional communication
• $O(n)$ messages
What about time complexity?

What is special about synchronous rings?
• Can convey information by not sending a message
“when your phone doesn't ring, it's me”

TimeSlice
Runs in phases
• each phase consists of $n$ rounds
• in phase $i \ge 0$
  – if no one has been elected yet
  – the processor with ID $i$
  – declares itself the leader
  – sends a token with its UID around the ring
Message complexity: $n$
Time complexity: $n \cdot UID_{min}$
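Because every processor knows $n$ and the rounds are synchronized, the outcome of TimeSlice can be computed directly. The sketch below is only an illustration: the name `time_slice` is hypothetical, and it reports the round at which the leader declares itself (the token then needs the $n$ rounds of that phase to circulate).

```python
from typing import List, Tuple


def time_slice(uids: List[int]) -> Tuple[int, int, int]:
    """Return (leader index, messages sent, rounds elapsed when the leader declares itself)."""
    n = len(uids)
    phase = 0
    while True:
        # Phase v spans rounds v*n .. v*n + n - 1. Silence in all earlier
        # phases tells everyone that no UID smaller than v exists.
        if phase in uids:
            leader = uids.index(phase)
            messages = n  # the single token is forwarded once per processor
            return leader, messages, phase * n
        phase += 1
```

For instance, `time_slice([5, 3, 8])` elects the processor holding UID 3 after $3 \cdot 3 = 9$ silent rounds, using only 3 messages: the message count is independent of the UIDs, while the running time grows with $UID_{min}$.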

VariableSpeeds
• Each process $p_i$ initiates a token
• Different tokens travel at different speeds:
  – for the token carrying $UID_v$, 1 message every $2^v$ rounds
  – (each process waits $2^v$ rounds after receiving the token before sending it out)
• Each process keeps track of the smallest UID seen
• Discard any token with UID greater than the smallest UID seen

Complexity Analysis
• By the time $UID_{min}$ goes around the ring, the second smallest UID has gone only half way, the third smallest a fourth of the way, etc.
• Forwarding the token carrying $UID_{min}$ has caused more messages than all the other tokens combined
• Message complexity is bounded by $2n$
• Time complexity: $O(n \cdot 2^{UID_{min}})$
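A round-by-round simulation shows the slow tokens being overtaken. This is a sketch under assumptions, not code from the slides: the name `variable_speeds` is hypothetical, all processors start in the same round, UIDs are non-negative integers, a token with UID `v` is forwarded once every `2 ** v` rounds, and a processor drops a waiting token as soon as it has seen a smaller UID.

```python
from typing import List, Tuple


def variable_speeds(uids: List[int]) -> Tuple[int, int, int]:
    """Simulate VariableSpeeds; returns (leader index, messages sent, rounds)."""
    n = len(uids)
    smallest = list(uids)  # smallest UID each processor has seen so far
    # Each token is [uid, current holder, rounds left before it is forwarded].
    tokens = [[u, i, 2 ** u] for i, u in enumerate(uids)]
    messages = rounds = 0
    while True:
        rounds += 1
        survivors = []
        for uid, holder, wait in tokens:
            if uid > smallest[holder]:
                continue  # the holder has seen a smaller UID: discard
            if wait > 1:
                survivors.append([uid, holder, wait - 1])
                continue
            nxt = (holder + 1) % n  # forward one hop clockwise
            messages += 1
            if uids[nxt] == uid:
                return nxt, messages, rounds  # token is back home: elected
            survivors.append([uid, nxt, 2 ** uid])
        for uid, holder, _ in survivors:
            smallest[holder] = min(smallest[holder], uid)
        tokens = survivors
```

On the ring `[1, 0, 2]` the token with UID 0 moves every round and returns after 3 rounds, while the other two tokens together account for a single message.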

Variable start times
Processors can start the protocol at different times
• processors that wake up spontaneously (participants) send a token with their UID around the ring
• processors that wake up on receiving a UID (relays) do not initiate their own token

A message life cycle
• A message is in phase one
  – until it is received by an awake processor
  – forwarded immediately
• A message is in phase two
  – once received by an awake processor
  – forwarded after $2^{UID} - 1$ rounds

The New Algorithm
When a participant receives a message from $p_i$:
• if $UID_i$ is larger than the minimal UID seen (including its own), swallow it
• otherwise, delay it for $2^{UID_i} - 1$ rounds
When a relay receives a message from $p_i$:
• if $UID_i$ is larger than the minimal UID seen (not including its own), swallow it
• otherwise, delay it for $2^{UID_i} - 1$ rounds

Correctness
Lemma: Only the participant processor with the smallest identifier receives its token back
Proof:
• Let $p_i$ be the participating processor with the smallest UID
• No processor can swallow $UID_i$
• All tokens must go through $p_i$, and will be swallowed
• No other processor can receive its token back

Complexity
Three categories of messages:
• phase one messages
• phase two messages sent before the message of the eventual leader enters its second phase
• phase two messages sent after the message of the eventual leader enters its second phase

Complexity
Lemma: The total number of messages in the first category is at most $n$.
Proof: The lemma follows because at most one phase one message is forwarded by each processor
• Suppose $p_i$ forwards two phase one messages, carrying $UID_j$ and $UID_k$
• Assume, WLOG, that $p_j$ is closer to $p_i$ than $p_k$. Then, the phase one message with $UID_k$ must go through $p_j$
• If $p_j$ is awake, then it becomes a phase two message
• Otherwise, $p_j$ becomes a relay and does not send its UID

Complexity
Lemma: The total number of messages in the second category is at most $n$
Proof:
• After the first process awakens, it takes at most $n$ rounds before the message with $UID_{min}$ reaches a participant
• During this time, the token with $UID_v$ is responsible for at most $n / 2^v$ messages
• The maximum number of messages is obtained when UIDs are small ($0, 1, \ldots, n-1$)
• Max number of messages in the second category: $\sum_{v=1}^{n-1} n / 2^v \le n$

Complexity
Lemma: The total number of messages in the third category is at most $2n$
Proof: analogous to the complexity analysis for VariableSpeeds
In summary:
• Message complexity: at most $4n$
• Time complexity: $O(n \cdot 2^{UID_{min}})$

And now for something completely different...
RANDOMIZATION

Randomized Algorithms
Extend the transition function to accept as input
• a random number
• from a bounded range
• under some fixed distribution

Why is it important?
The bad news: randomization alone does not generally affect
• impossibility results
  – leader election in an anonymous network is still impossible!
• worst case bounds
The good news: randomization + weakening of the problem statement does

Example: Randomized Leader Election
• Impossibility in anonymous rings still holds
• but can now elect a leader with some probability
• So weaken LE as follows
Safety: In every configuration of every admissible execution, at most one processor is in an elected state
Liveness: At least one processor is elected with some non-zero probability
Behaviors allowed by the weakened specification:
• terminate without a leader
• never terminate

Back to Leader Election
• Use randomization to have processes generate a pseudo-identifier
• Use a deterministic leader election algorithm to work with pseudo-identifiers
• Not just any deterministic LE algorithm:
  – needs to work correctly if multiple processes generate the same pseudo-identifier
  – a plus is the ability to detect if no leader is elected

A first result
Assume
• synchronous ring
• non-uniform ring
• processors can randomly choose identifiers
Theorem: There is a randomized algorithm which, with probability $c > 1/e$, elects a leader in a synchronous ring; the algorithm sends $O(n^2)$ messages

The Algorithm
Code for processor $p_i$:
  Initially
    pid_i := 1 with probability 1 - 1/n, 2 with probability 1/n
    send <pid_i> to left
  upon receiving <S> from right
    if |S| = n then
      if pid_i is the unique max(S) then
        elected := true
      else
        elected := false
    else
      send <S · pid_i> to left
Observations:
• randomization used once
• one execution for each element of $\mathcal{R} = \{1, 2\}^n$

Definitions
• $exec(R)$: the execution determined by $R \in \mathcal{R}$
• Given a predicate $P$ on executions, $\Pr[P]$ is the probability of the event $\{R \in \mathcal{R} : exec(R) \text{ satisfies } P\}$

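Analysis
What is the probability that the algorithm terminates with a leader?
• if each processor picks pseudo-identifier 2 with probability $1/n$ and 1 otherwise, a leader is elected exactly when a single processor picks 2: $c = \binom{n}{1} \cdot \frac{1}{n} \cdot \left(1 - \frac{1}{n}\right)^{n-1} = \left(1 - \frac{1}{n}\right)^{n-1} > 1/e$
Message complexity: $O(n^2)$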
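The bound $c > 1/e$ is easy to check numerically. The sketch below assumes each processor independently picks pseudo-identifier 2 with probability $1/n$ and 1 otherwise (the distribution consistent with the stated bound; the slide's own formula did not survive extraction), and `success_probability` is a hypothetical helper name.

```python
import math
from fractions import Fraction


def success_probability(n: int) -> Fraction:
    """Probability that exactly one of n processors draws pseudo-identifier 2."""
    p = Fraction(1, n)
    # n ways to choose the unique processor that draws 2; the other n - 1 draw 1.
    return n * p * (1 - p) ** (n - 1)


for n in (2, 3, 10, 100):
    c = success_probability(n)
    print(n, float(c), float(c) > 1 / math.e)
```

The probability decreases toward $1/e$ as $n$ grows but stays above it for every $n \ge 2$.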

Not good enough?
Trade off more time and messages for a higher probability of success
• if $|S| = n$ and $p_i$ detects no single max in $S$
  – choose a new $pid_i$
  – restart the algorithm
• $\mathcal{R}$ becomes a set of $n$-tuples, each of which is a possibly infinite sequence over $\{1, 2\}$

Analysis
Probability of success in iteration $k$: $(1-c)^{k-1} \cdot c$
Time complexity:
• worst-case number of iterations: unbounded ($\infty$)
• expected number of iterations: $1/c < e$
Expected value of $T$ (the number of iterations): $E[T] = \sum_{k \ge 1} k (1-c)^{k-1} c = 1/c$
Expected message complexity: $O(n^2)$
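The expected number of iterations follows from the mean of a geometric distribution. As a sketch, writing $x = 1 - c$ and using the standard identity $\sum_{k \ge 1} k x^{k-1} = 1/(1-x)^2$ for $|x| < 1$:

```latex
E[T] \;=\; \sum_{k \ge 1} k\,(1-c)^{k-1}\,c
     \;=\; c \cdot \frac{1}{\bigl(1 - (1-c)\bigr)^{2}}
     \;=\; \frac{1}{c} \;<\; e
     \qquad \text{since } c > 1/e .
```

Each iteration costs $O(n^2)$ messages, so a constant expected number of iterations keeps the expected message complexity at $O(n^2)$.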

Impossibility of Uniform Algorithms
Theorem: There is no uniform randomized algorithm for leader election in a synchronous anonymous ring that terminates in even a single execution for a single ring size

Summary
• No deterministic solution for anonymous rings
• No solution for uniform anonymous rings (even when using randomization)
• Protocols with $O(n^2)$ and $O(n \log n)$ messages for uniform rings
• $\Omega(n \log n)$ lower bound on message complexity for practical protocols
• $O(n)$ message complexity for uniform synchronous rings