CS 230 - Distributed Systems
Lecture 1 - Introduction to Distributed Systems
Tuesdays, Thursdays 3:30-4:50 p.m.
Prof. Nalini Venkatasubramanian
nalini@ics.uci.edu

Course logistics and details
- Course Web page
  - http://www.ics.uci.edu/~cs230
- Lectures - TuTh 3:30-4:50 p.m.
- Must Read: Course Reading List
  - Collection of technical papers and reports by topic
- Reference Books
  - Distributed Systems: Concepts & Design, 4th ed. by Coulouris et al. ISBN: 0-321-26354-5. (preferred)
  - Distributed Systems: Principles and Paradigms, 2nd ed. by Tanenbaum & van Steen. ISBN: 0-132-39227-5.
  - Distributed Computing: Principles, Algorithms, and Systems, 1st ed. by Kshemkalyani & Singhal. ISBN: 0-521-87634-6.

Prerequisite Knowledge
- Necessary: Operating Systems Concepts and Principles, basic computer system architecture
- Highly Desirable: Understanding of Computer Networks, Network Protocols
- Necessary: Basic programming skills in Java, C++, ...

Course logistics and details
- Homeworks
  - Paper summaries
- Midterm Examination
- Course Project
  - May be done individually or in groups
    - Project proposal due end of Week 2
    - Survey of related research due end of Week 6
    - Final project presentations/demos/reports: Finals week
  - Potential projects will be available on the course webpage

CompSci 230 Grading Policy
- Homeworks - 30% of final grade
  - 1 paper summary due every week after Week 2, covering topics discussed the previous week.
- Midterm - 30% of final grade
  - Tentatively in Week 7
- Class Project - 40% of final grade
- Final assignment of grades will be based on a curve.

Lecture Schedule
- Weeks 1, 2, 3: Distributed Systems Fundamentals
  - Introduction: Needs/Paradigms
    - Basic Concepts and Terminology, Concurrency
  - Time and State in Distributed Systems
    - Physical and Logical Clocks
    - Distributed Snapshots, Termination Detection, Consensus
- Weeks 4, 5, 6: Distributed OS and Middleware Issues
  - Interprocess Communication
    - Remote Procedure Calls, Distributed Shared Memory
  - Distributed Process Coordination/Synchronization
    - Distributed Mutual Exclusion/Deadlocks, Leader Election
  - Distributed Process and Resource Management
    - Task Migration, Load Balancing
  - Distributed I/O and Storage Subsystems
    - Distributed File Systems

Lecture Schedule
- Weeks 7, 8: Messaging and Communication in Distributed Systems
  - Naming in Distributed Systems
  - Gossip, Tree, Mesh Protocols
  - Group Communication
- Weeks 9, 10: Non-functional "ilities" in distributed systems
  - Reliability and Fault Tolerance
  - Quality of Service and Real-time Needs
- Sample Distributed Systems (time permitting)
  - P2P, Grid and Cloud Computing, Mobile/Pervasive

What is not covered
- Security in Distributed Systems (Prof. Tsudik's course)
- Distributed Database Management and Transaction Processing (CS 223)
- Distributed Objects and Middleware Platforms (CS 237)

Introduction
- Distributed Systems
  - Multiple independent computers that appear as one
  - Lamport's Definition
    - "You know you have one when the crash of a computer you have never heard of stops you from getting any work done."
  - "A number of interconnected autonomous computers that provide services to meet the information processing needs of modern enterprises."

Next Generation Information Infrastructure
[Figure labels: DeviceNets & SensorNets; Electronic Commerce; Distance Learning; Wide Area Network (Internet); Visualization; Battle Planning; Collaborative Multimedia (Telemedicine); Collaborative Task; Clients; Server farms]
- Requirements: Availability, Reliability, Quality-of-Service, Cost-effectiveness, Security

Characterizing Distributed Systems
- Multiple Autonomous Computers
  - each consisting of CPUs, local memory, stable storage, and I/O paths connecting to the environment
  - Geographically distributed
- Interconnections
  - some I/O paths interconnect computers that talk to each other
- Shared State
  - No shared memory
  - systems cooperate to maintain shared state
  - maintaining global invariants requires correct and coordinated operation of multiple computers.

Examples of Distributed Systems
- Transactional applications - Banking systems
- Manufacturing and process control
- Inventory systems
- General purpose (university, office automation)
- Communication: email, IM, VoIP, social networks
- Distributed information systems
  - WWW
  - Cloud Computing Infrastructures
  - Federated and Distributed Databases

A Distributed CyberPhysical Space: UCI Responsphere
- Campus-wide infrastructure to instrument, experiment, monitor, run disaster drills, and validate technologies
- Sensing, communicating, storage & computing infrastructure
- Software for real-time collection, analysis, and processing of sensor information
- Used to create real-time information awareness & post-drill analysis

Why Distributed Computing?
- Inherent distribution
  - Bridge customers, suppliers, and companies at different sites.
- Speedup - improved performance
- Fault tolerance
- Resource Sharing
  - Exploitation of special hardware
- Scalability
- Flexibility

Peer to Peer Systems
- P2P File Sharing
  - Napster, Gnutella, Kazaa, eDonkey, BitTorrent
  - Chord, CAN, Pastry/Tapestry, Kademlia
- P2P Communications
  - MSN, Skype, Social Networking Apps
- P2P Distributed Computing
  - SETI@home
- Use the vast resources of machines at the edge of the Internet to build a network that allows resource sharing without any central authority.

Why are Distributed Systems Hard?
- Scale
  - numeric, geographic, administrative
- Loss of control over parts of the system
- Unreliability of message passing
  - unreliable communication, insecure communication, costly communication
- Failure
  - Parts of the system are down or inaccessible
  - Independent failure is desirable

Design goals of a distributed system
- Sharing
  - HW, SW, services, applications
- Openness (extensibility)
  - use of standard interfaces, advertise services, microkernels
- Concurrency
  - compete vs. cooperate
- Scalability
  - avoids centralization
- Fault tolerance/availability
- Transparency
  - location, migration, replication, failure, concurrency

Classifying Distributed Systems
- Based on degree of synchrony
  - Synchronous
  - Asynchronous
- Based on communication medium
  - Message Passing
  - Shared Memory
- Fault model
  - Crash failures
  - Byzantine failures

Computation in distributed systems
- Asynchronous system
  - no assumptions about process execution speeds and message delivery delays
- Synchronous system
  - make assumptions about relative speeds of processes and delays associated with communication channels
  - constrains implementation of processes and communication
- Models of concurrency
  - Communicating processes
  - Functions, Logical clauses
  - Passive Objects
  - Active objects, Agents

Concurrency issues
- Consider the requirements of transaction-based systems
  - Atomicity - either all effects take place or none
  - Consistency - correctness of data
  - Isolation - as if there were one serial database
  - Durability - effects are not lost
- General correctness of distributed computation
  - Safety
  - Liveness

Communication in Distributed Systems
- Provide support for entities to communicate among themselves
  - Centralized (traditional) OSs - local communication support
  - Distributed systems - communication across machine boundaries (WAN, LAN).
- 2 paradigms
  - Message Passing
    - Processes communicate by exchanging messages
  - Distributed Shared Memory (DSM)
    - Communication through a virtual shared memory.

Message Passing
- Basic communication primitives
  - Send message
  - Receive message
- Modes of communication
  - Synchronous
    - atomic action requiring the participation of the sender and receiver.
    - Blocking send: blocks until message is transmitted out of the system send queue
    - Blocking receive: blocks until message arrives in receive queue
  - Asynchronous
    - Non-blocking send: sending process continues after message is sent
    - Blocking or non-blocking receive: Blocking receive implemented by timeout or threads. Non-blocking receive proceeds while waiting for message. Message is queued (buffered) upon arrival.
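
The send and receive modes listed on this slide can be tried out inside a single process. The following is a minimal sketch, assuming a thread-safe in-memory queue stands in for the receive queue; the function name `demo_message_passing` and the message contents are illustrative and not part of the lecture.

```python
import queue
import threading


def demo_message_passing():
    """Show a non-blocking send, a blocking receive with timeout,
    and a non-blocking receive on one buffered mailbox."""
    mailbox = queue.Queue()  # receive queue: messages are buffered on arrival
    log = []

    # Non-blocking receive: returns immediately even though nothing has arrived.
    try:
        mailbox.get_nowait()
    except queue.Empty:
        log.append("non-blocking receive: no message yet")

    def sender():
        # Non-blocking send: enqueue and continue without waiting for the receiver.
        mailbox.put_nowait("hello")

    t = threading.Thread(target=sender)
    t.start()

    # Blocking receive: waits until a message arrives (or the timeout expires).
    msg = mailbox.get(timeout=5.0)
    log.append(f"blocking receive: {msg}")
    t.join()

    # Blocking receive bounded by a timeout: gives up when nothing arrives.
    try:
        mailbox.get(timeout=0.05)
    except queue.Empty:
        log.append("blocking receive: timed out")

    return log


if __name__ == "__main__":
    for line in demo_message_passing():
        print(line)
```

A real distributed system would replace the in-memory queue with sockets or a messaging layer, but the blocking and non-blocking behavior seen by the caller is the same idea.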

Reliability issues
- Unreliable communication
  - Best effort, no ACKs or retransmissions
  - Application programmer designs own reliability mechanism
- Reliable communication
  - Different degrees of reliability
  - Processes have some guarantee that messages will be delivered.
  - Reliability mechanisms - ACKs, NACKs.
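
One way to layer reliability on top of a best-effort channel is to retransmit until an ACK comes back. The sketch below is only illustrative: `LossyChannel` and `send_reliably` are hypothetical names, losses follow a fixed schedule so the run is repeatable, and ACKs themselves are assumed never to be lost.

```python
from typing import Iterator, List, Optional


class LossyChannel:
    """Hypothetical best-effort channel that drops transmissions
    according to a fixed schedule (True = drop this transmission)."""

    def __init__(self, drop_pattern: List[bool]):
        self._drops: Iterator[bool] = iter(drop_pattern)
        self.delivered: List[str] = []

    def transmit(self, msg: str) -> bool:
        """Return True if the message was delivered and ACKed, False if it was lost."""
        if next(self._drops, False):  # past the end of the schedule nothing is dropped
            return False
        self.delivered.append(msg)
        return True


def send_reliably(channel: LossyChannel, msg: str, max_attempts: int = 5) -> Optional[int]:
    """Retransmit until an ACK arrives; return the number of attempts used,
    or None if every attempt was lost."""
    for attempt in range(1, max_attempts + 1):
        if channel.transmit(msg):
            return attempt
    return None


if __name__ == "__main__":
    ch = LossyChannel([True, True, False])  # first two transmissions are lost
    print(send_reliably(ch, "m1"), ch.delivered)
```

If ACKs could also be lost, the receiver might see the same message more than once, which is why retransmission schemes are usually paired with sequence numbers or request identifiers.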

Remote Procedure Call
- Builds on message passing
  - extend traditional procedure call to perform transfer of control and data across network
  - Easy to use - fits well with the client/server model.
  - Helps programmer focus on the application instead of the communication protocol.
  - Server is a collection of exported procedures on some shared resource
  - Variety of RPC semantics
    - "maybe call"
    - "at least once call"
    - "at most once call"
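
The difference between "at least once" and "at most once" shows up when a reply is lost and the client retries. The following sketch assumes a common approach, tagging each request with an ID and caching replies on the server; `CounterServer` and `call_with_retry` are hypothetical names, and the network is simulated by direct method calls.

```python
class CounterServer:
    """Hypothetical server exporting one non-idempotent procedure: add to a counter."""

    def __init__(self, at_most_once: bool):
        self.at_most_once = at_most_once
        self.counter = 0
        self._replies = {}  # request_id -> cached reply, used to suppress duplicates

    def handle(self, request_id: int, amount: int) -> int:
        if self.at_most_once and request_id in self._replies:
            # Duplicate request: return the saved reply without re-executing.
            return self._replies[request_id]
        self.counter += amount
        self._replies[request_id] = self.counter
        return self.counter


def call_with_retry(server: CounterServer, request_id: int, amount: int,
                    lost_replies: int) -> int:
    """Client stub: the first `lost_replies` replies are lost in transit,
    so the client resends the same request until one reply gets through."""
    for _ in range(lost_replies):
        server.handle(request_id, amount)  # executed on the server, reply lost
    return server.handle(request_id, amount)


if __name__ == "__main__":
    at_least_once = CounterServer(at_most_once=False)
    print(call_with_retry(at_least_once, request_id=1, amount=10, lost_replies=2))

    at_most_once = CounterServer(at_most_once=True)
    print(call_with_retry(at_most_once, request_id=1, amount=10, lost_replies=2))
```

With at-least-once semantics the procedure runs on every retry, which is only safe for idempotent operations; the reply cache gives at-most-once execution at the cost of server-side state. A "maybe call" would correspond to sending once and never retrying.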

Distributed Shared Memory
- Communication abstraction used for processes on machines that do not share memory
  - Motivated by shared memory multiprocessors that do share memory
[Figure labels: CPU 1, CPU 2, CPU 3, CPU 4, Memory]

Distributed Shared Memory
- Processes read and write from virtual shared memory.
  - Primitives - read and write
  - OS ensures that all processes see all updates
- Caching on local node for efficiency
  - Issue - cache consistency
[Figure labels: CPU, Cache, Memory]
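
The cache consistency issue can be illustrated with a write-invalidate scheme, which is one of several possible protocols and is not prescribed by the slide. In this sketch `SharedMemory` and `Node` are hypothetical classes, and all "nodes" live in one process so that invalidation is a direct method call rather than a network message.

```python
class SharedMemory:
    """Hypothetical home copy of the virtual shared memory,
    plus the list of nodes that may hold cached copies."""

    def __init__(self):
        self.store = {}
        self.nodes = []

    def register(self, node):
        self.nodes.append(node)

    def write(self, writer, addr, value):
        self.store[addr] = value
        # Write-invalidate: every other node drops its stale cached copy.
        for node in self.nodes:
            if node is not writer:
                node.cache.pop(addr, None)


class Node:
    """A machine with a local cache in front of the shared memory."""

    def __init__(self, memory: SharedMemory):
        self.memory = memory
        self.cache = {}
        memory.register(self)

    def read(self, addr):
        if addr not in self.cache:
            # Cache miss: fetch from the home copy (KeyError if never written).
            self.cache[addr] = self.memory.store[addr]
        return self.cache[addr]

    def write(self, addr, value):
        self.cache[addr] = value
        self.memory.write(self, addr, value)


if __name__ == "__main__":
    mem = SharedMemory()
    a, b = Node(mem), Node(mem)
    a.write("x", 1)
    print(b.read("x"))   # b caches x
    a.write("x", 2)      # invalidates b's cached copy
    print(b.read("x"))   # b misses and refetches the new value
```

Without the invalidation loop, node `b` would keep returning its cached value after `a` updates `x`, which is exactly the inconsistency the slide warns about.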

Fault Models in Distributed Systems
- Crash failures
  - A processor experiences a crash failure when it ceases to operate at some point without any warning. Failure may not be detectable by other processors.
    - Failstop - processor fails by halting; detectable by other processors.
- Byzantine failures
  - completely unconstrained failures
  - conservative, worst-case assumption for behavior of hardware and software
  - covers the possibility of intelligent (human) intrusion.

Other Fault Models in Distributed Systems
- Dealing with message loss
  - Crash + Link
    - Processor fails by halting. Link fails by losing messages but does not delay, duplicate or corrupt messages.
  - Receive Omission
    - processor receives only a subset of messages sent to it.
  - Send Omission
    - processor fails by transmitting only a subset of the messages it actually attempts to send.
  - General Omission
    - Receive and/or send omission
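
The omission models above can be viewed as filters applied to the stream of messages between two processors. The sketch below is a toy model under that assumption; the function `run` and the predicates passed to it are illustrative names only, and messages are plain integers.

```python
from typing import Callable, List

Predicate = Callable[[int], bool]


def always(_: int) -> bool:
    """Fault-free behavior: every message gets through."""
    return True


def run(messages: List[int],
        send_ok: Predicate = always,
        recv_ok: Predicate = always) -> List[int]:
    """Return the messages the receiver actually gets.

    send_ok(m) is False when the sender omits message m (send omission);
    recv_ok(m) is False when the receiver omits message m (receive omission).
    """
    sent = [m for m in messages if send_ok(m)]
    return [m for m in sent if recv_ok(m)]


msgs = list(range(6))

# Send omission: the sender transmits only the even-numbered messages.
send_omission = run(msgs, send_ok=lambda m: m % 2 == 0)

# Receive omission: the receiver picks up only messages 0, 1, 2.
receive_omission = run(msgs, recv_ok=lambda m: m < 3)

# General omission: both kinds of omission occur together.
general_omission = run(msgs,
                       send_ok=lambda m: m % 2 == 0,
                       recv_ok=lambda m: m < 3)

if __name__ == "__main__":
    print(send_omission, receive_omission, general_omission)
```

In every case the surviving messages are an unmodified subset of what was sent, which is what separates omission failures from Byzantine failures, where messages may also be altered or fabricated.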

Other Distributed System issues
- Concurrency and Synchronization
- Distributed Deadlocks
- Time in distributed systems
- Naming
- Replication
  - improve availability and performance
- Migration
  - of processes and data
- Security
  - eavesdropping, masquerading, message tampering, replaying

Client/Server Computing
- Client/server computing allocates application processing between the client and server processes.
- A typical application has three basic components:
  - Presentation logic
  - Application logic
  - Data management logic

Client/Server Models
- There are at least three different models for distributing these functions:
  - Presentation logic module running on the client system and the other two modules running on one or more servers.
  - Presentation logic and application logic modules running on the client system and the data management logic module running on one or more servers.
  - Presentation logic and a part of the application logic module running on the client system, and the other part(s) of the application logic module and the data management module running on one or more servers.

Distributed Computing Environment (DCE)
- DCE, from the Open Software Foundation (OSF) and now X/Open, offers an environment that spans multiple architectures, protocols, and operating systems.
  - DCE is supported by major software vendors.
- It provides key distributed technologies, including RPC, a distributed naming service, a time synchronization service, a distributed file system, a network security service, and a threads package.

DCE
[Figure labels, DCE architecture: Applications; DCE Distributed File Service; DCE Security; Other Basic Services; Distributed Directory Services; Time Service; Management; DCE Remote Procedure Calls; DCE Threads Services; Operating System Transport Services]

Distributed Systems Middleware
- Middleware is the software between the application programs and the operating system and base networking
- Integration fabric that knits together applications, devices, systems software, data
- Middleware provides a comprehensive set of higher-level distributed computing capabilities and a set of interfaces to access the capabilities of the system.

Distributed Systems Middleware
- Enables the modular interconnection of distributed software
  - abstract over low level mechanisms used to implement resource management services.
- Computational Model
  - Support separation of concerns and reuse of services
- Customizable, Composable Middleware Frameworks
  - Provide for dynamic network and system customizations, dynamic invocation/revocation/installation of services.
  - Concurrent execution of multiple distributed systems policies.

The Evergrowing Middleware Alphabet Soup
[Figure labels: Distributed Computing Environment (DCE); Orbix; IOP; IIOP; GIOP; WSDL; WS-BPEL; WSIL; Java Transaction API (JTA); JNDI; LDAP; JMS; BPEL; BEA Tuxedo®; Object Request Broker (ORB); EAI; RTCORBA; SOAP; Message Queuing (MSMQ); Distributed Component Object Model (DCOM); XQuery; XPath; opalORB; ORBlite; Encina/9000; Remote Method Invocation (RMI); JINI™; Rendezvous; Enterprise JavaBeans Technology (EJB); BEA WebLogic®; Remote Procedure Call (RPC); Extensible Markup Language (XML); ZEN; IDL; Borland® VisiBroker®]

Distributed Object Computing
- Combining distributed computing with an object model.
  - Allows software reusability and a more abstract level of programming
  - The use of a broker-like entity or bus that keeps track of processes, provides messaging between processes and other higher level services
  - Examples
    - CORBA
    - JINI, EJB, J2EE
    - E-SPEAK
    - Note: DCE uses a procedure-oriented distributed systems model, not an object model.