EEC-681/781 Distributed Computing Systems Lecture 2 Wenbing Zhao

EEC-681/781 Distributed Computing Systems
Lecture 2
Wenbing Zhao
Department of Electrical and Computer Engineering
Cleveland State University
wenbing@ieee.org
Fall Semester 2006 EEC-681: Distributed Computing Systems

Advertisement: My Research Projects
• Secure and Dependable Web Services
  – Reliable group communication for WS
  – Byzantine fault tolerance for WS
  – Reservation-based extended transactions
• Randomized Service Migration
• Byzantine Fault Tolerant Database Systems
• Secure and Dependable VoIP

Outline
• Overview of distributed systems
  – Definition of distributed systems
  – Design Goals

Definition of a Distributed System
• A collection of independent computers that appears to its users as a single coherent system
  – Autonomous computers connected by a network
  – Software specifically designed to provide an integrated computing facility => Our focus!

Definition of a Distributed System
• Two aspects: (1) independent computers and (2) single system => middleware
[Figure 1.1: distributed applications on top of middleware services]

Definition of a Distributed System
“You know you have a distributed system when the crash of a computer you’ve never heard of stops you from getting any work done.” – Leslie Lamport
• What does that imply?
  – Inter-dependencies
  – Shared state
  – Failure handling is difficult
• Trade-off: strong coherency => strong coupling!

Distributed Application Examples
• The World Wide Web
• Automated banking systems
• Global positioning systems
• Retail point-of-sale terminals
• Air-traffic control
• Peer-to-peer file sharing systems

Motivation for Distribution
• Share resources
• Personalize environments
• Location independence
• People & information are distributed
• Performance & cost
• Modularity & expandability
• Availability & reliability
• Scalability

Design Goals
• Connecting users and resources:
  – Reduced cost, increased connectivity and sharing
• Transparency:
  – Users feel like they are using a single-user system
• Openness:
  – Services provided are based on standards

Design Goals
• Flexibility:
  – Separation of policy and mechanisms
• Scalability:
  – Able to support a large number of nodes, etc.
• Availability:
  – Fault tolerant
• Security:
  – Confidentiality and integrity

Distribution Transparency
• Access transparency
• Location transparency
• Migration transparency
• Relocation transparency
• Replication transparency
• Concurrency transparency
• Failure transparency
• Persistency transparency

Access Transparency
• Hide differences in data representation and how a resource is accessed
• Ex 1: Byte ordering of integer types
  – Sun SPARC processor: big endian (high-order byte is stored and transmitted first)
  – Intel processor: little endian (low-order byte is stored and transmitted first)
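The byte-ordering difference can be made concrete with a short sketch. It uses Python's standard `struct` module; the value `0x12345678` and the helper `decode_from_wire` are illustrative assumptions, not part of the lecture. Agreeing on one wire format (here network byte order, which is big endian) is one common way middleware hides the difference from applications.

```python
import struct

value = 0x12345678  # arbitrary 32-bit integer chosen for illustration

big = struct.pack(">I", value)      # big endian: most significant byte first
little = struct.pack("<I", value)   # little endian: least significant byte first
network = struct.pack("!I", value)  # network byte order, defined as big endian


def decode_from_wire(data: bytes) -> int:
    """Decode a 32-bit unsigned integer that was sent in network byte order.

    Hypothetical helper: the receiver gets the same integer back
    regardless of its own processor's native byte order.
    """
    return struct.unpack("!I", data)[0]


print(big.hex(), little.hex(), decode_from_wire(network) == value)
```

Both sides only ever call the pack and unpack helpers, so neither needs to know the other's native byte order.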

Access Transparency
• Ex 2: File systems
  – Windows uses NTFS, FAT32, FAT16, etc. A file is accessed using the form C:\a folder\a file.ext
  – Linux uses ext3, ReiserFS, XFS, ext2, etc. A file is accessed using the form /home/username/directory/file.ext
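As a small illustration of hiding such naming differences, the sketch below uses Python's standard `pathlib` module, whose `PureWindowsPath` and `PurePosixPath` classes expose the same interface for both path styles. The two example paths and the `describe` helper are assumptions made for illustration only.

```python
from pathlib import PurePosixPath, PureWindowsPath


def describe(path):
    """Return (parent folder name, file stem, suffix) for either path style."""
    return path.parent.name, path.stem, path.suffix


# Illustrative paths only; neither file needs to exist.
win = PureWindowsPath(r"C:\a folder\a file.ext")
posix = PurePosixPath("/home/username/directory/file.ext")

print(describe(win))
print(describe(posix))
```

The caller of `describe` never parses separators or drive letters itself, which is the essence of access transparency at this small scale.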

Access Transparency
• Example: Web application
  – When you fetch a Web page, it is displayed in your browser almost immediately
  – It doesn’t matter whether your system is the same as that of the Web server

Location Transparency
• Location transparency - Hide where a resource is located
  – A logical name, instead of a physical name, is used to refer to resources
• Example: WWW
  – You use a domain name to refer to a resource, e.g., http://www.google.com points to the main Google page
  – You do not need to know where it is physically located

Migration Transparency
• Migration transparency - Hide that a resource may move to another location
• Closely related to location transparency and enabled by using logical names for resources
• Example: WWW
  – A domain name can be mapped to different IP addresses
  – The same IP address can be reassigned to a server located in a different place
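The role of logical names in location and migration transparency can be sketched with a toy name service. Everything here is hypothetical: the `NameService` class, the `fetch` function, the name `www.example.org`, and the documentation-range IP addresses are assumptions for illustration and do not model real DNS.

```python
class NameService:
    """Hypothetical in-memory directory mapping logical names to addresses."""

    def __init__(self):
        self._table = {}

    def register(self, name, address):
        # Registering an existing name again models a resource that has moved.
        self._table[name] = address

    def resolve(self, name):
        return self._table[name]


def fetch(names, logical_name):
    """Client code: only the logical name appears here, never an address."""
    address = names.resolve(logical_name)
    return f"connected to {address}"


names = NameService()
names.register("www.example.org", "192.0.2.10")
before = fetch(names, "www.example.org")

names.register("www.example.org", "198.51.100.7")  # the resource migrates
after = fetch(names, "www.example.org")

print(before)
print(after)
```

The client call is identical before and after the move; only the mapping held by the name service changes.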

Relocation Transparency
• Relocation transparency - Hide that a resource may be moved to another location while in use
• Example
  – A mobile user can keep using his/her laptop without losing the Internet connection while moving across different cells

Replication Transparency
• Replication transparency - Hide that a resource is replicated
  – More than one copy is available
  – All replicas should have the same visible name

Concurrency Transparency
• Concurrency transparency - Hide that a resource may be shared by several competing users
  – Easy to guarantee if accesses to the same resource are all read-only
  – Care must be taken to maintain consistency if some accesses are updates
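A minimal single-process sketch of the update case, assuming a hypothetical `SharedCounter` resource accessed by several threads standing in for competing users. A lock serializes the updates so each client can behave as if it were the only user; real distributed systems need more elaborate mechanisms, such as transactions.

```python
import threading


class SharedCounter:
    """Hypothetical shared resource; the lock keeps concurrent updates consistent."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # only one update at a time
            self._value += 1

    def read(self):
        with self._lock:
            return self._value


def run_clients(counter, clients=4, updates=1000):
    """Start several competing clients and return the final counter value."""

    def work():
        for _ in range(updates):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.read()


total = run_clients(SharedCounter())
print(total)
```

The clients contain no coordination code of their own; the shared resource hides the fact that it is being used concurrently.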

Failure Transparency
• Failure transparency - Hide the failure and recovery of a resource
  – Can be achieved through replication
  – But very challenging and costly in general
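The idea of masking a failure through replication can be sketched as a read that silently fails over to another copy. The `Replica` class, the `ReplicaDown` exception, and `transparent_read` are hypothetical names for illustration; the sketch ignores the hard parts (keeping replicas consistent and detecting failures reliably).

```python
class ReplicaDown(Exception):
    """Raised by a replica that has failed (hypothetical failure signal)."""


class Replica:
    def __init__(self, name, data, alive=True):
        self.name = name
        self.data = data
        self.alive = alive

    def read(self, key):
        if not self.alive:
            raise ReplicaDown(self.name)
        return self.data[key]


def transparent_read(replicas, key):
    """Try each replica in turn so the caller never sees a single failure."""
    for replica in replicas:
        try:
            return replica.read(key)
        except ReplicaDown:
            continue  # mask the failure and move on to the next copy
    raise ReplicaDown("all replicas failed")


data = {"balance": 100}
replicas = [
    Replica("r1", data, alive=False),  # this copy has crashed
    Replica("r2", data),
    Replica("r3", data),
]
result = transparent_read(replicas, "balance")
print(result)
```

The caller gets its answer even though the first copy is down; the failure becomes visible only when every replica has failed.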

Persistency Transparency
• Persistency transparency - Hide whether a (software) resource is in memory or on disk

Degree of Transparency
• Observation: Aiming at full distribution transparency may be too much
• Sometimes distribution is apparent and not something you want to hide
  – E.g., users may be located on different continents

Degree of Transparency
• Completely hiding failures of networks and nodes is (theoretically and practically) impossible
  – You cannot distinguish a slow computer from a failing one
  – You can never be sure that a server actually performed an operation before a crash
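Both bullets can be illustrated with a tiny simulation of what a client observes when it waits for a reply with a timeout. The `Outcome` record, the `observe` function, and the three scenarios are assumptions made for illustration; no real network is involved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    executed: bool                # did the server perform the operation?
    reply_delay: Optional[float]  # seconds until the reply; None = no reply ever


def observe(outcome: Outcome, timeout: float) -> str:
    """Return what the client sees: a reply, or only a timeout."""
    if outcome.reply_delay is not None and outcome.reply_delay <= timeout:
        return "reply"
    return "timeout"


scenarios = {
    "slow server": Outcome(executed=True, reply_delay=5.0),
    "crash before executing": Outcome(executed=False, reply_delay=None),
    "crash after executing": Outcome(executed=True, reply_delay=None),
}

observations = {name: observe(o, timeout=1.0) for name, o in scenarios.items()}
print(observations)
```

All three scenarios look the same to the client, so from the timeout alone it cannot tell whether the server is slow or dead, nor whether the operation was performed.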

Degree of Transparency
• Full transparency costs performance, which in turn exposes the distribution of the system
  – Keeping Web caches exactly up-to-date with the master copy
  – Immediately flushing write operations to disk for fault tolerance

Openness of Distributed Systems
• Open distributed system: Able to interact with services from other open systems, irrespective of the underlying environment
  – Conform to well-defined interfaces
  – Support portability of applications
  – Easily interoperate

Openness of Distributed Systems
• Achieving openness: At least make the distributed system independent of the heterogeneity of the underlying environment
  – Hardware
  – Platforms
  – Languages

Implementation Openness
• Requires support for different policies specified by applications and users
  – What level of consistency do we require for client-cached data?
  – Which operations do we allow downloaded code to perform?
  – What level of secrecy do we require for communication?

Implementation Openness
• Ideally, a distributed system provides only mechanisms:
  – Allow (dynamic) setting of caching policies, preferably per cacheable item
  – Support different levels of trust for mobile code
  – Offer different encryption algorithms

Mechanisms and Policies
• Mechanisms determine how to do something, while policies decide what should be done
• Separating policy from mechanism allows maximum flexibility, both in choosing policies initially and in changing policy decisions later

Example: Managing a Queue
• Let’s use an abstract priority queue as an example
• We need to support mechanisms for:
  – Inserting/deleting items at the start
  – Inserting/deleting items at the end
  – Finding out the length of the queue

Example: Managing a Queue
• The queue can be implemented in different ways
• Policies can be, for example, FIFO or LIFO
• Policies should be decided by the queue user
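The queue example can be sketched as follows. The `Queue` class supplies only mechanisms (insert and delete at either end, plus length), and the user picks a policy by choosing which end to remove from. The class, the `fifo_take` and `lifo_take` functions, and the `drain` helper are hypothetical names; the sketch is built on Python's `collections.deque` as one of many possible implementations.

```python
from collections import deque


class Queue:
    """Mechanisms only: insert/delete at either end and report the length."""

    def __init__(self):
        self._items = deque()

    def insert_start(self, item):
        self._items.appendleft(item)

    def insert_end(self, item):
        self._items.append(item)

    def delete_start(self):
        return self._items.popleft()

    def delete_end(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)


# Policies are chosen by the queue user, not by the queue itself.
def fifo_take(q):
    return q.delete_start()  # oldest item first


def lifo_take(q):
    return q.delete_end()  # newest item first


def drain(items, take):
    """Insert all items at the end, then remove them using the given policy."""
    q = Queue()
    for item in items:
        q.insert_end(item)
    out = []
    while len(q):
        out.append(take(q))
    return out


print(drain([1, 2, 3], fifo_take))
print(drain([1, 2, 3], lifo_take))
```

Switching from FIFO to LIFO changes one argument in the caller and nothing in the queue, which is the flexibility the separation is meant to provide.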

Scale in Distributed Systems
• Scalability can be measured along three dimensions:
  – Size scalability – We can easily add more users and resources to the system
  – Geographical scalability – Users and resources may lie far apart geographically
  – Administrative scalability – The system can still be easy to manage even if it spans many independent administrative organizations

Scale in Distributed Systems
• Scalability problems in distributed systems appear as performance problems caused by the limited capacity of servers and the network

Size Scalability
• Thomas J. Watson, Chairman of IBM, 1943:
  – “I think there is a world market for maybe five computers”
• Internet:
  – July 1993: 1,776,000 computers
  – July 1999: 56,218,000 computers
  – January 2002: 168,000 computers and > 23,000 DNS domains

Size Scalability Problems
Centralization is not good for size scalability

Concept                  Example
Centralized services     A single server for all users
Centralized data         A single on-line telephone book
Centralized algorithms   Doing routing based on complete information

Size Scalability Problems
• Problem with running centralized algorithms in distributed systems
  – They would result in an enormous number of messages that have to be routed over many lines
• Any algorithm that operates by collecting information from all sites, sending it to a single machine for processing, and then distributing the results must be avoided
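A back-of-the-envelope sketch of why the collect-process-distribute pattern scales poorly. It assumes a deliberately simplified model that is not from the lecture: one message per site in each direction, and, for comparison, a scheme in which every machine exchanges one message each way with a fixed number of neighbours. The function names are hypothetical.

```python
def centralized_load(n_sites: int) -> int:
    """Messages handled by the single central machine in one round.

    Simplified model: it receives one message from each of the other
    sites and sends one result back to each of them.
    """
    return 2 * (n_sites - 1)


def neighbour_load(degree: int) -> int:
    """Messages handled by any one machine that talks only to its neighbours.

    Simplified model: one send and one receive per neighbour, so the
    load does not depend on the total number of sites.
    """
    return 2 * degree


for n in (10, 1_000, 100_000):
    print(n, centralized_load(n), neighbour_load(4))
```

Under these assumptions the central machine's load grows linearly with the number of sites, while the per-machine load in the neighbour-only scheme stays constant.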

Decentralized Algorithm Characteristics
• No machine has complete information about the system state
• Machines make decisions based only on local information
• Failure of one machine does not ruin the algorithm
• There is no implicit assumption that a global clock exists

Geographical Scalability Problems
• Interprocess communication in WANs has much longer latency than in LANs
• Communication in WANs is inherently unreliable, and virtually always point-to-point
• Centralized components reduce geographical scalability, just as they reduce size scalability

Administration Scalability Problems
• Different administrative domains usually impose different policies
  – E.g., with respect to resource usage, management, and security