
EEC-681/781 Distributed Computing Systems
Lecture 10
Wenbing Zhao
wenbing@ieee.org
Cleveland State University
Fall Semester 2006

Outline
• Clock synchronization issues
• Clock synchronization algorithms
  – Centralized
  – Distributed
• Event ordering and logical clocks
• Due date for project progress report
  – 11/20, Monday, midnight
  – No extension!

Motivation for Clock Synchronization
• In everyday life, we rely on clocks to coordinate our activities
  – For example, in EEC 681, we meet every Monday and Wednesday, 6-7:50 pm
• In computer systems, it is also convenient to use clocks to coordinate different activities
  – For example, the “make” program relies on file timestamps to decide whether recompilation is necessary

Motivation for Clock Synchronization
• Stock market buy and sell orders
• Secure document timestamps (with cryptographic certification)
• Aviation traffic control and position reporting
• Radio and TV programming launch and monitoring
• Intruder detection, location and reporting
• Multimedia synchronization for real-time teleconferencing
• Network monitoring, measurement and control
• Differentiated services traffic engineering

Coordinated Universal Time
• Coordinated Universal Time (UTC):
  – Based on the number of transitions per second of the cesium-133 atom (pretty accurate)
  – At present, the real time is taken as the average of some 50 cesium clocks around the world
  – Introduces a leap second from time to time to compensate for days getting longer
• UTC is broadcast through shortwave radio and satellite
  – Satellites can give an accuracy of about 0.5 ms

Physical Clocks in Computer Systems
• Every machine has a timer that generates an interrupt H times per second
• There is a clock in machine p that ticks on each timer interrupt
  – Denote the value of that clock by Cp(t), where t is UTC time
  – Ideally, for each machine p, Cp(t) = t, or, in other words, dC/dt = 1

Clock Time and UTC
[Figure: the relation between clock time C and UTC when clocks tick at different rates]
• In practice: 1 - ρ ≤ dC/dt ≤ 1 + ρ
• Maximum drift rate: ρ
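Writing the maximum drift rate as ρ, the drift bound has a practical consequence: two clocks that agreed at some moment can drift in opposite directions, so after Δt seconds they may differ by up to 2ρΔt. The sketch below turns that into a resynchronization interval. The function names and the sample values of ρ and δ are assumptions chosen for illustration, not figures from the lecture.

```python
def max_skew_after(rho: float, elapsed_s: float) -> float:
    """Worst-case skew between two clocks that agreed `elapsed_s` seconds ago.

    Each clock may gain or lose up to rho * elapsed_s, and the two clocks
    may drift in opposite directions, hence the factor of 2.
    """
    return 2 * rho * elapsed_s


def max_resync_interval(delta_s: float, rho: float) -> float:
    """Longest time between resynchronizations that keeps skew within delta_s."""
    return delta_s / (2 * rho)


if __name__ == "__main__":
    rho = 1e-5     # assumed drift rate: 10 microseconds per second
    delta = 1e-3   # assumed tolerated skew: 1 ms
    print("skew after one minute:", max_skew_after(rho, 60.0), "s")
    print("resynchronize at least every:", max_resync_interval(delta, rho), "s")
```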

Terms
• Clock drift rate (or clock accuracy)
  – The amount of deviation from UTC per unit of time (a day, a week, etc.)
• Clock precision
  – Resolution of the clock, e.g., 1 ms or 1 ns
• Clock skew
  – The difference in time values of two clocks
  – The maximum clock skew of a group of clocks is determined by the two clocks that differ the most

Clock Synchronization
• Two clocks are said to be synchronized at a particular instant of time if the clock skew of the two clocks is less than some specified constant δ
• A set of clocks is said to be synchronized if the clock skew of any two clocks in the set is less than δ
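The definition can be checked directly: the largest pairwise skew in a group is the gap between the fastest and the slowest reading, so comparing that gap with δ is enough. This is a minimal sketch. It assumes all readings were taken at the same instant, and the function names are hypothetical.

```python
from typing import Sequence


def max_clock_skew(readings: Sequence[float]) -> float:
    """Largest difference between any two clock readings in the group."""
    return max(readings) - min(readings)


def is_synchronized(readings: Sequence[float], delta: float) -> bool:
    """True if every pair of clocks differs by less than delta."""
    return max_clock_skew(readings) < delta


if __name__ == "__main__":
    readings = [10.000, 10.004, 9.998]   # seconds, read at the same instant
    print(is_synchronized(readings, delta=0.010))
    print(is_synchronized(readings, delta=0.005))
```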

Clock Synchronization Issues
• A distributed system requires:
  – External synchronization
    • Synchronize with an external time source
  – Internal synchronization
    • Clocks within the same network synchronize with each other
• Clock synchronization requires:
  – Each node can read the other nodes’ clock values
    • Must account for unpredictable communication delay
  – Time must never run backward
    • Adjust clocks smoothly so that the order of events is maintained
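One common way to meet the "never run backward" requirement is to spread a correction over many timer interrupts instead of jumping the clock. The sketch below illustrates the idea under stated assumptions: a 10 ms nominal tick and a 1 ms per-tick correction limit are illustrative values, and the class name `SlewedClock` is hypothetical.

```python
class SlewedClock:
    """Software clock advanced by timer interrupts and corrected gradually."""

    def __init__(self, tick_ms: float = 10.0, max_slew_ms: float = 1.0):
        if max_slew_ms >= tick_ms:
            # Otherwise a single tick could move the clock backward or stall it.
            raise ValueError("max_slew_ms must be smaller than tick_ms")
        self.time_ms = 0.0
        self.tick_ms = tick_ms          # nominal time added per interrupt
        self.max_slew_ms = max_slew_ms  # largest correction applied per interrupt
        self.pending_ms = 0.0           # outstanding correction (negative: clock is ahead)

    def adjust(self, offset_ms: float) -> None:
        """Request a correction; it is applied bit by bit on later interrupts."""
        self.pending_ms += offset_ms

    def on_interrupt(self) -> float:
        """Advance the clock by one tick, plus a bounded share of the correction."""
        step = max(-self.max_slew_ms, min(self.max_slew_ms, self.pending_ms))
        self.pending_ms -= step
        self.time_ms += self.tick_ms + step   # always a positive increment
        return self.time_ms


if __name__ == "__main__":
    clock = SlewedClock()
    clock.adjust(-3.0)   # the clock is 3 ms fast
    print([clock.on_interrupt() for _ in range(5)])
```

A clock that is ahead advances by 9 ms per tick until the correction is used up, so it slows down but never steps back.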

Message Propagation Time
• Estimate of message propagation time: (T1 - T0 - I)/2
  – T0 and T1 are the client’s local clock times when it sends the request and when it receives the reply
  – I is the time the server takes to handle the request

Message Delay Distribution
[Figure: message exchange between client and server, with timestamps T1 and T4 on the client timeline and T2 and T3 on the server timeline]

Clock Synchronization Algorithms
• Centralized
  – Passive Time Server Centralized Algorithm
    • Cristian’s algorithm
  – Active Time Server Centralized Algorithm
    • Berkeley algorithm
• Distributed
  – Global Averaging Distributed Algorithms
  – Localized Averaging Distributed Algorithms

Passive Time Server
• Each node periodically sends a message (time = ?) to the time server at its current local clock time, T0
• The server responds with a message (time = T), where T is the current time of the server
• The client receives the message at local clock time T1 and adjusts its local clock to T + (T1 - T0 - I)/2
  – I is the time taken by the server to handle the request message
  – Take several measurements of T1 - T0 and discard the unreliable ones
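The client-side arithmetic can be sketched as follows. The slide only says to discard unreliable measurements. Here that is interpreted, as an assumption, as dropping samples whose round trip exceeds a threshold and then using the fastest remaining one. The names `Sample`, `corrected_time`, and `best_estimate` are hypothetical.

```python
from typing import Iterable, NamedTuple, Optional


class Sample(NamedTuple):
    t0: float              # client clock when the request was sent
    t1: float              # client clock when the reply arrived
    server_time: float     # T, the time reported by the server
    handling: float = 0.0  # I, the server's request handling time


def propagation_estimate(s: Sample) -> float:
    """One-way delay estimate: (T1 - T0 - I) / 2."""
    return (s.t1 - s.t0 - s.handling) / 2


def corrected_time(s: Sample) -> float:
    """Value the client sets its clock to: T + (T1 - T0 - I) / 2."""
    return s.server_time + propagation_estimate(s)


def best_estimate(samples: Iterable[Sample], max_round_trip: float) -> Optional[float]:
    """Discard slow (unreliable) samples, then trust the fastest round trip."""
    reliable = [s for s in samples if s.t1 - s.t0 <= max_round_trip]
    if not reliable:
        return None   # nothing trustworthy; the caller should measure again
    fastest = min(reliable, key=lambda s: s.t1 - s.t0)
    return corrected_time(fastest)


if __name__ == "__main__":
    good = Sample(t0=100.0, t1=100.5, server_time=200.0, handling=0.1)
    slow = Sample(t0=101.0, t1=104.0, server_time=202.0, handling=0.1)
    print(best_estimate([slow, good], max_round_trip=1.0))
```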

Active Time Server
• The time server periodically broadcasts its clock time (T)
• Other nodes receive the message and use it to correct their own clocks
  – Each node knows the approximate time (Ta) required for the propagation of the message and corrects its clock to T + Ta
  – Each node replies with its local clock time
• The server
  – Knows the approximate propagation time from each node
  – Takes a fault-tolerant average of the clock values as the current time
  – Adjusts its own clock and sends each node the amount by which that node’s clock must be adjusted
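The server-side step can be sketched as below. The slide does not say how the fault-tolerant average is computed. Ignoring readings far from the median is one possible choice, used here as an assumption. The readings are assumed to be already compensated for propagation time, and the function names are hypothetical.

```python
import statistics
from typing import Dict, List, Tuple


def fault_tolerant_average(values: List[float], tolerance: float) -> float:
    """Average only the values that lie within `tolerance` of the median."""
    median = statistics.median(values)
    kept = [v for v in values if abs(v - median) <= tolerance]
    return sum(kept) / len(kept)


def berkeley_adjustments(
    server_clock: float, node_clocks: Dict[str, float], tolerance: float
) -> Tuple[float, Dict[str, float]]:
    """Return (server adjustment, per-node adjustments) toward the average.

    Adjustments are relative amounts (target minus reading), so each node
    can apply its own correction without knowing the absolute time.
    """
    readings = [server_clock, *node_clocks.values()]
    target = fault_tolerant_average(readings, tolerance)
    node_adjustments = {name: target - value for name, value in node_clocks.items()}
    return target - server_clock, node_adjustments


if __name__ == "__main__":
    # Clock values in minutes past midnight: 3:00, 2:50, 3:25, and a faulty 15:00.
    nodes = {"a": 170.0, "b": 205.0, "faulty": 900.0}
    print(berkeley_adjustments(180.0, nodes, tolerance=60.0))
```

The faulty clock is left out of the average but still receives an adjustment that brings it back in line.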

Centralized Algorithms – Drawbacks
• Single point of failure
• Scalability

Global Averaging Distributed Algorithms
• Each node broadcasts its local clock time periodically
• Each node waits for time T
  – The node collects the messages broadcast by other nodes
  – For each message received, the node records the local time at which it arrived
  – At the end of T, the node estimates the skew of its clock with respect to each of the other nodes on the basis of the times at which it received their messages
  – The node computes a fault-tolerant average of the estimated skews and uses it to adjust its local clock
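The per-node computation at the end of the waiting period might look like the sketch below. Two things are assumptions rather than details from the slide: a single fixed estimate of the message delay, and a fault-tolerant average that drops the m highest and m lowest skew estimates before averaging. All names are hypothetical.

```python
from typing import List, Tuple


def estimate_skews(received: List[Tuple[float, float]], delay: float) -> List[float]:
    """Skew of each sender relative to this node.

    Each entry of `received` is (sender_clock_in_message, local_receive_time).
    If clocks agreed, the message would arrive at sender_clock + delay.
    """
    return [sender + delay - local for sender, local in received]


def trimmed_mean(values: List[float], m: int) -> float:
    """Average after discarding the m smallest and m largest values."""
    if len(values) <= 2 * m:
        raise ValueError("not enough values to discard 2*m of them")
    kept = sorted(values)[m:len(values) - m]
    return sum(kept) / len(kept)


def clock_correction(received: List[Tuple[float, float]], delay: float, m: int) -> float:
    """Amount to add to the local clock at the end of the waiting period."""
    skews = estimate_skews(received, delay) + [0.0]   # own skew to itself is zero
    return trimmed_mean(skews, m)


if __name__ == "__main__":
    # Two plausible senders and one faulty sender reporting a wild clock value.
    received = [(100.0, 98.0), (101.0, 100.0), (500.0, 102.0)]
    print(clock_correction(received, delay=1.0, m=1))
```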

Localized Averaging Distributed Algorithms
• Each node exchanges its clock time with its neighbors
• Then sets its clock time to the average of its own clock and the clock times of its neighbors
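A small simulation shows why repeated local averaging pulls clocks together. The sketch assumes a ring of four nodes, synchronous rounds in which all nodes update at once, and no message delay. These are simplifications for illustration, and the function name is hypothetical.

```python
from typing import Dict, List


def averaging_round(clocks: List[float], neighbors: Dict[int, List[int]]) -> List[float]:
    """One round: every node averages its own clock with its neighbors' clocks."""
    updated = []
    for node, own in enumerate(clocks):
        values = [own] + [clocks[n] for n in neighbors[node]]
        updated.append(sum(values) / len(values))
    return updated


if __name__ == "__main__":
    ring = {0: [3, 1], 1: [0, 2], 2: [1, 3], 3: [2, 0]}   # assumed topology
    clocks = [100.0, 104.0, 96.0, 108.0]
    for round_no in range(5):
        clocks = averaging_round(clocks, ring)
        print(round_no + 1, [round(c, 3) for c in clocks], "spread:", max(clocks) - min(clocks))
```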

Exercise
• Consider the behavior of two machines in a distributed system. Both have clocks that are supposed to tick 1000 times per millisecond. One of them actually does, but the other ticks only 990 times per millisecond. If UTC updates come in once a minute, what is the maximum clock skew that will occur?
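One way to approach the exercise: the slow clock loses a fixed fraction of real time, and the loss accumulates until the next UTC update resets it. The sketch below does that arithmetic, assuming both clocks are set exactly to UTC at each update. The function name is hypothetical. For the numbers in the exercise it gives about 600 ms.

```python
def max_skew_ms(nominal_ticks_per_ms: float,
                actual_ticks_per_ms: float,
                update_interval_s: float) -> float:
    """Skew accumulated against a perfect clock between two UTC updates, in ms."""
    relative_error = abs(nominal_ticks_per_ms - actual_ticks_per_ms) / nominal_ticks_per_ms
    return relative_error * update_interval_s * 1000.0


if __name__ == "__main__":
    # 990 instead of 1000 ticks per ms is a 1% error; 1% of 60 s is 0.6 s.
    print(max_skew_ms(1000, 990, 60))
```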

Event Ordering
• “Time, Clocks, and the Ordering of Events in a Distributed System”, by Leslie Lamport, Communications of the ACM, July 1978, Volume 21, Number 7, pp. 558-565
• He showed that it is possible to synchronize all the clocks to produce a single, unambiguous time standard
• He pointed out that clock synchronization need not be absolute
  – What usually matters is not that all processes agree on exactly what time it is, but rather that they agree on the order in which events occur

Happens-Before Relation
• With perfectly accurate physical time
  – An event a happened before an event b if a happened at an earlier time than b
• Without using physical clocks
  – Assume that the system is composed of a collection of processes, and that each process consists of a sequence of events
  – The events of a process form a sequence, where a occurs before b in this sequence if a happens before b
  – Assume that sending or receiving a message is an event in a process

Happens-Before Relation
• The “happens-before” relation, denoted by “→”, is defined as follows:
  – The relation “→” on the set of events of a system is the relation satisfying the following three conditions:
    • If a and b are events in the same process, and a comes before b, then a → b
    • If a is the sending of a message by one process and b is the receipt of the same message by another process, then a → b
    • If a → b and b → c, then a → c
  – a → b means that event a can causally affect event b

Partial Ordering
• Two distinct events a and b are said to be concurrent if neither a → b nor b → a (that is, a ↛ b and b ↛ a)
  – Neither event can causally affect the other
  – This gives only a partial ordering of events in a system with concurrently operating processes
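The three conditions of the happens-before relation map onto reachability in a directed graph: process order and message edges are the direct links, and transitivity is path following. Two distinct events are concurrent when neither can reach the other. The sketch below is one way to check this. The `EventGraph` class and the event labels are hypothetical.

```python
from typing import Dict, Hashable, List, Set


class EventGraph:
    """Events as nodes; edges for process order and for message send -> receive."""

    def __init__(self) -> None:
        self._successors: Dict[Hashable, Set[Hashable]] = {}

    def add_process(self, events: List[Hashable]) -> None:
        """Condition 1: events of one process, listed in the order they occur."""
        for earlier, later in zip(events, events[1:]):
            self._successors.setdefault(earlier, set()).add(later)

    def add_message(self, send: Hashable, receive: Hashable) -> None:
        """Condition 2: the send of a message precedes its receipt."""
        self._successors.setdefault(send, set()).add(receive)

    def happens_before(self, a: Hashable, b: Hashable) -> bool:
        """Condition 3 (transitivity): is there a path of one or more edges from a to b?"""
        stack = list(self._successors.get(a, ()))
        seen: Set[Hashable] = set()
        while stack:
            event = stack.pop()
            if event == b:
                return True
            if event not in seen:
                seen.add(event)
                stack.extend(self._successors.get(event, ()))
        return False

    def concurrent(self, a: Hashable, b: Hashable) -> bool:
        """Distinct events with no causal path in either direction."""
        return a != b and not self.happens_before(a, b) and not self.happens_before(b, a)


if __name__ == "__main__":
    g = EventGraph()
    g.add_process(["p1", "p2", "p3"])
    g.add_process(["q1", "q2"])
    g.add_message("p2", "q2")           # p2 sends a message that q2 receives
    print(g.happens_before("p1", "q2"))
    print(g.concurrent("p3", "q2"))
```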

Logical Clocks
• Logical clocks: use the clock just as a way of assigning a number to an event, where the number is the time at which the event occurs
  – Define a clock Ci for each process Pi
    • Ci assigns a number Ci(a) to any event a in that process
    • The entire system of clocks is represented by the function C, which assigns to any event b the number C(b), where C(b) = Cj(b) if b is an event in process Pj
    • The clocks Ci are logical clocks rather than physical clocks

Implementation of Logical Clocks
• A system of logical clocks is correct if the events of the system that are related to each other by the happens-before relation can be properly ordered using these clocks
• Clock condition:
  – For any events a and b, if a → b then C(a) < C(b)

Implementation of Logical Clocks
• According to our definition of the happens-before relation, the clock condition is satisfied if the following two conditions hold:
  – C1: if a and b are events in process Pi, and a comes before b, then Ci(a) < Ci(b)
  – C2: if a is the sending of a message by process Pi and b is the receipt of that message by process Pj, then Ci(a) < Cj(b)

Implementation of Logical Clocks
• To meet C1:
  – Each process Pi increments Ci between any two successive events
• To meet C2:
  – (a) If event a is the sending of a message m by process Pi, then the message m contains a timestamp Tm = Ci(a)
  – (b) Upon receiving a message m, process Pj sets Cj greater than or equal to its present value and greater than Tm

Implementation of Logical Clocks by Counters
• A Lamport logical clock is a monotonically increasing software counter
• Each process Pi keeps its own logical clock Ci, which is used to apply Lamport timestamps to events
• To capture the happens-before relation →, processes update their logical clocks and transmit the values of their logical clocks in messages as follows:
  – Before each event at Pi: Ci := Ci + 1
  – When Pi sends a message m, it piggybacks t = Ci
  – When Pj receives (m, t): Cj := max(Cj, t) + 1 (the + 1 is the increment for the receive event itself)
• e → e’ => C(e) < C(e’)
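The three update rules fit in a few lines. The sketch below follows the slide in folding the increment for the receive event into the receive rule. The class and method names are hypothetical.

```python
class LamportClock:
    """Logical clock kept by one process."""

    def __init__(self) -> None:
        self.time = 0

    def local_event(self) -> int:
        """Rule 1: increment before each event; returns the event's timestamp."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event too; the returned value is piggybacked on the message."""
        self.time += 1
        return self.time

    def receive(self, t: int) -> int:
        """Rule 3: jump past the sender's timestamp, never moving backward."""
        self.time = max(self.time, t) + 1
        return self.time


if __name__ == "__main__":
    p, q = LamportClock(), LamportClock()
    p.local_event()                  # an internal event at P
    t = p.send()                     # P sends m, carrying timestamp t
    received_at = q.receive(t)       # Q receives m
    print(t, received_at)            # the send is timestamped before the receive
```

Note that the implication on the slide runs one way only: C(e) < C(e’) does not by itself show that e → e’, because concurrent events also receive comparable numbers.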

Implementation of Logical Clocks by Counters
Question: Which two events are concurrent?

Total Ordering of Events
• We can use logical clocks satisfying the Clock Condition to place a total ordering on the set of all system events
  – Simply order the events by the times at which they occur
  – To break ties, Lamport proposed the use of any arbitrary total ordering of the processes, e.g., by process id
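With process ids as the tie-breaker, the total order is a lexicographic sort on (timestamp, process id). A minimal sketch follows. The tuple layout and event labels are assumptions made for illustration.

```python
from typing import List, Tuple

Event = Tuple[int, int, str]   # (Lamport timestamp, process id, label)


def total_order(events: List[Event]) -> List[Event]:
    """Sort by logical time first; break ties by process id."""
    return sorted(events, key=lambda e: (e[0], e[1]))


if __name__ == "__main__":
    events = [(2, 1, "b"), (1, 2, "a"), (2, 0, "c")]
    for timestamp, pid, label in total_order(events):
        print(timestamp, pid, label)
```

Because every process produces distinct timestamps for its own events, no two events share the same (timestamp, process id) pair, so the order is unambiguous.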

Total Ordering of Events
• Using this method, we can assign a unique timestamp to each event in a distributed system to provide a total ordering of all events
• Very useful in distributed systems
  – Solving the mutual exclusion problem
  – Totally ordered reliable multicast => needed to build fault-tolerant systems