Distributed Systems 1 BINA RAMAMURTHY Copyright 2010 B. Ramamurthy 10/27/2021
Introduction
• A distributed system is a network of autonomous computers cooperating to accomplish a task.
• Hardware and software components of a distributed system coordinate their activity by sharing resources such as data, computation, compute cycles, bandwidth and storage.
• Examples: Internet, intranet, grid and mobile computing systems.
Topics
• Some fundamental terms
• Advances in client-side technologies
• Communication network
• Middleware concept
• Server-side technology
• Application models (Web, Web 2.0, Web 3.0)
Evolution of Internet Computing (CSE 507 Introduction, 2008)
[Figure: stages of Internet computing plotted as scale against time. Labels: Publish, Inform, Interact, Integrate, Transact, Discover (intelligence), Automate (discovery), ???. Annotations: deep web, Parallel HPC web, Semantic discovery.]
Beyond Search Engines: Enabling Information Technology and Scientific Applications
[Figure labels: TV/remote; simple search (stateless); wireless device; complex multi-organizational applications. Example applications: Financial: build portfolio; Environment: plan forestation; Medicine: plan treatment; Biotech: drug discovery.]
Terminologies
Protocol
• A protocol is a set of rules that end points in a telecommunication system use when exchanging information.
• IP: Internet Protocol defines an unreliable packet transfer protocol.
• TCP: Transmission Control Protocol builds on IP to define a reliable data delivery protocol.
• LDAP: Lightweight Directory Access Protocol builds on TCP to define a query-response protocol for querying the state of a remote database.
• HTTP: Hypertext Transfer Protocol builds on TCP to facilitate hypertext document exchange.
Service
• A service is a network-enabled entity that provides a specific capability.
• Service = Protocol + Behavior
• A service definition permits many implementations.
• Examples: ability to move files, create processes, verify access rights.
• An FTP server speaks File Transfer Protocol and supports remote read and write access to a collection of files.
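The point that one service definition permits many implementations can be sketched in a few lines. This is an illustration only: `FileService`, `InMemoryFileService`, `DirectoryFileService` and `round_trip` are hypothetical names, not part of FTP or any real library.

```python
from abc import ABC, abstractmethod
from pathlib import Path
import tempfile


class FileService(ABC):
    """Hypothetical service definition: the capability every implementation offers."""

    @abstractmethod
    def read(self, name: str) -> bytes: ...

    @abstractmethod
    def write(self, name: str, data: bytes) -> None: ...


class InMemoryFileService(FileService):
    """One implementation: files live in a dictionary."""

    def __init__(self):
        self._files = {}

    def read(self, name):
        return self._files[name]

    def write(self, name, data):
        self._files[name] = data


class DirectoryFileService(FileService):
    """Another implementation: files live under a directory on disk."""

    def __init__(self, root: Path):
        self._root = root

    def read(self, name):
        return (self._root / name).read_bytes()

    def write(self, name, data):
        (self._root / name).write_bytes(data)


def round_trip(service: FileService) -> bytes:
    """Client code depends only on the definition, not on the implementation."""
    service.write("notes.txt", b"hello")
    return service.read("notes.txt")


def demo():
    with tempfile.TemporaryDirectory() as root:
        return (
            round_trip(InMemoryFileService()),
            round_trip(DirectoryFileService(Path(root))),
        )
```

The client function `round_trip` behaves the same with either implementation, which is the sense in which the definition is separate from the implementation.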
API
• An Application Program Interface (API) defines a standard interface for invoking a specified set of functionality.
• Example: the Generic Security Service (GSS) API defines standard functions for verifying the identity of communicating parties, encrypting messages and so forth.
SDK
• A Software Development Kit (SDK) denotes a set of code designed to be linked with, and invoked from within, an application program to provide specified functionality.
• An SDK typically implements an API.
• Example: different SDKs implement the GSS-API using the Kerberos or PKI protocols, respectively.
Internet
• The Internet is a very large distributed system: an interconnection of a collection of heterogeneous networks of computers.
• Protocols: IP, TCP, HTTP
• Services: World Wide Web (WWW), file transfer (FTP), email, etc.
Advances in Client-side programming
Client programming
• Simple programs written as a single module
• Single entry point, typically in a “main” function
• Procedural, functional and object-oriented
• Applications, applets (web-based)
• More recently the focus is on rich content: Rich Internet Application (RIA)
• Adobe Flash, Adobe Flex, Microsoft Silverlight, Ajax
• Model-View-Controller (MVC) model for design and deployment flexibility. Examples: Java Swing, Struts.
Client/Server
• Server: a process on a networked computer that accepts requests from other (local or remote) processes to perform a service and responds appropriately.
• Client: the requesting process in the above is referred to as the client.
• Requests and responses are in the form of messages (request/response model).
• The client is said to invoke an operation on the server.
• Many distributed systems today are constructed out of interacting clients and servers.
• Issues: connectivity (speed and accessibility), addressing, naming
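A minimal sketch of the request/response model, assuming a TCP connection on the loopback interface and a toy "echo" service. The function names (`serve_once`, `call_server`, `demo`) are made up for this illustration, and a single `recv` is enough only because the messages are tiny.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server: accept one client, read its request, send back a response."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(1024).decode("utf-8")
        conn.sendall(f"echo: {request}".encode("utf-8"))


def call_server(port: int, request: str) -> str:
    """Client: send a request message and wait for the response message."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(request.encode("utf-8"))
        return client.recv(1024).decode("utf-8")


def demo() -> str:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)           # do not wait forever if no client arrives
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()
    try:
        return call_server(port, "hello")
    finally:
        server.join()
        listener.close()
```

Here the address ("127.0.0.1" plus a port number) stands in for the addressing and naming issues listed above; a real deployment would resolve a service name to such an address.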
Advances in Networking
Internetworking
• Internet stack
• Standardization
• IPv4, IPv6: Internet Protocol version 4, version 6
• Tremendous increase in network bandwidth, measured in bits per second: from a few kilobits per second to gigabits per second (56 kb/s dial-up lines, 1.5 Mb/s T1, 100 Gb/s Ethernet)
• What can you do with such fast delivery speed? Connect them up...
• Networked application models
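To get a feel for the bandwidth figures above, here is a small worked calculation. It assumes a 1 GB file (decimal gigabyte, 8 × 10^9 bits) and ignores latency and protocol overhead, so the numbers are idealized lower bounds rather than measurements.

```python
FILE_BITS = 8 * 10**9  # assumed file size: 1 GB expressed in bits

# Link rates quoted on the slide, in bits per second.
LINK_RATES_BPS = {
    "56 kb/s dial-up": 56e3,
    "1.5 Mb/s T1": 1.5e6,
    "100 Gb/s Ethernet": 100e9,
}


def transfer_seconds(size_bits: float, rate_bps: float) -> float:
    """Ideal transfer time: size divided by rate."""
    return size_bits / rate_bps


def transfer_times() -> dict:
    """Ideal transfer time in seconds for each link."""
    return {
        name: transfer_seconds(FILE_BITS, rate)
        for name, rate in LINK_RATES_BPS.items()
    }


if __name__ == "__main__":
    for name, seconds in transfer_times().items():
        print(f"{name}: {seconds:,.2f} s")
```

Under these assumptions the same file takes roughly 39.7 hours over dial-up, about 89 minutes over a T1 line, and 0.08 seconds over 100 Gb/s Ethernet.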
Communication Network
[Diagram: Application on top of the Network Protocol Stack]
A communication middleware framework isolates the application developers from the details of the network protocol.
Client/server Issues
• Basic object technology could not fully deliver on promises such as reusability and interoperability in the context of Internet and enterprise-level applications.
• Deployment was still a major problem, and as a result portability and mobility were impaired.
Issues in networked systems
• Heterogeneity in various aspects of a distributed system
• Different communication modes
  • Synchronous: RPC
  • Asynchronous: P2P, publish and subscribe
• Variations in products
  • Many vendors: IBM, IONA, TIBCO, Apache, Adobe
• Additional runtime features: fault tolerance, load balancing, transaction handling, usage metering, auditing, ...
Advances in Middleware
Communication Middleware
[Diagram: Application, Middleware, Network Protocol Stack, layered top to bottom]
Remote procedure call (RPC)
[Diagram: the client application makes a procedure call and the server application executes it; on each side the call passes through RPC stub code, the RPC library/runtime and the network protocol stack.]
RPC stubs and runtime enable location transparency, encapsulate the RPC communication infrastructure and provide a procedure call interface.
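The role of the stubs can be sketched as follows. This is a toy illustration, not a real RPC library: the JSON message format, `ClientStub`, `server_skeleton` and `CalculatorServer` are all assumptions made for the example, and the "network" is just a function call that carries bytes.

```python
import json


class CalculatorServer:
    """Server application: ordinary local procedures."""

    def add(self, a, b):
        return a + b


def server_skeleton(target, request_bytes: bytes) -> bytes:
    """Server-side stub: unmarshal the request, execute the call, marshal the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    # A real skeleton would only expose an explicit list of procedures.
    procedure = getattr(target, request["procedure"])
    result = procedure(*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


class ClientStub:
    """Client-side stub: looks like a local object, forwards each call as a message."""

    def __init__(self, transport):
        self._transport = transport  # callable taking request bytes, returning reply bytes

    def add(self, a, b):
        request = json.dumps({"procedure": "add", "args": [a, b]}).encode("utf-8")
        reply = self._transport(request)
        return json.loads(reply.decode("utf-8"))["result"]


# Wire the stub to the skeleton; a real runtime would send the bytes over a network.
server = CalculatorServer()
stub = ClientStub(lambda request: server_skeleton(server, request))
```

The caller writes `stub.add(2, 3)` exactly as it would for a local object, which is the location transparency mentioned above; only the transport passed to the stub knows where the server is.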
Distributed Objects
[Diagram: the client application invokes a method and the server application executes it; the call passes through client proxies, the ORB, server skeletons and the network protocol stack.]
An ORB (Object Request Broker) enables client applications to remotely instantiate, locate, invoke methods on and delete server objects. Examples: Java RMI, Microsoft’s DCOM, and CORBA, which is meant to be platform independent.
Middleware (e.g., CORBA, Grid)
[Diagram labels: client, server, “desktop”, middleware, “network”]
Technical Layers
[Diagram labels: Participant A, Participant B, Participant C; Business Logic; Core Assets; Technology Independent Interface Description (one per participant); Middleware Mapping; Technology Adapters (CORBA, GRID, XML, Web services); Communication facilities (XML, ORB); Middleware Buses]
What is a grid?
• A grid is a sophisticated framework that enables sharing of a variety of resources among distributed applications.
• Open standard
• Large scale operations
• Automatic
• Intelligent
• Spontaneous
• Interoperable
• Service-oriented
What is a grid? (A formal definition)
Grid specifies a standard architecture, infrastructure, protocols and application program interface (API) for building an open enterprise system. It can provide:
• Middleware supporting a network of systems to facilitate sharing, standardization and openness.
• Infrastructure and application model dealing with sharing of compute cycles, data, storage and other resources.
• A framework for high reliability, availability and security.
• Interoperation of batch-oriented and service-based architectures.
• Standard service-level feature definitions and higher-level concepts for inter- and intra-business collaboration.
Beginnings of the Grid
• Beginnings of the grid in the Search for Extra-Terrestrial Intelligence (SETI@home project): http://planetary.org/html/UPDATES/seti/index.html
• The Wow signal: http://planetary.org/html/UPDATES/seti/SETI@home/wowsignal.html
Message-oriented Middleware (MOM)
• Made famous by IBM’s MQSeries and TIBCO’s Rendezvous products.
• Based on messages and queues.
• A message contains a header and a payload.
• A queue can store and distribute messages.
• Publish/subscribe model of communication: a topic offers another model of communication between subscribers and publishers.
• MOM allows for loose coupling between message consumers and message producers, enabling dynamic, reliable, flexible, high-performance systems to be built.
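The two MOM models above, queues and topics, can be sketched with a tiny in-process broker. `MessageBroker` and its method names are hypothetical and are not the MQSeries or Rendezvous APIs; a real broker would also handle persistence, delivery guarantees and remote clients.

```python
from collections import defaultdict, deque


class MessageBroker:
    """Toy broker: named queues (point-to-point) and topics (publish/subscribe)."""

    def __init__(self):
        self._queues = defaultdict(deque)      # queue name -> stored messages
        self._subscribers = defaultdict(list)  # topic name -> subscriber callbacks

    # Queue model: the message is stored until one consumer receives it.
    def send(self, queue_name, header, payload):
        self._queues[queue_name].append({"header": header, "payload": payload})

    def receive(self, queue_name):
        queue = self._queues[queue_name]
        return queue.popleft() if queue else None

    # Topic model: every current subscriber gets a copy of each published message.
    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, header, payload):
        message = {"header": header, "payload": payload}
        for callback in self._subscribers[topic]:
            callback(message)
```

Producers and consumers only share a queue or topic name with the broker and never reference each other directly, which is the loose coupling the slide describes.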
Server-side Advances
Server-side advances: Two-tier applications
[Diagram labels: Presentation Logic, Business Logic, Database Server]
Server-side: Three-tier Applications
[Diagram labels: Presentation Logic, Business Logic, Database Server]
Tremendous advances in DBMS: relational models, query languages, object-relational systems, and transactional systems with ACID properties.
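The atomicity part of the ACID properties can be illustrated with Python's built-in `sqlite3` module. The `accounts` table, the `make_db` helper and the `transfer` function are assumptions made for this sketch; the point is only that a failed transfer leaves no partial update behind.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with two example accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> None:
    """Move money between accounts as one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes the block
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT id, balance FROM accounts"))
```

If the balance check fails, the debit that was already issued is rolled back together with the rest of the transaction, so the database never shows money leaving one account without arriving in the other.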
Programming Model for Web-based Applications
[Diagram labels: Web client; Web Service; Web Application; Web Container; Business Logic; Logic container; Enterprise components; Database Server]
Components and Application Servers
• An application server mediates between a web server and backend systems.
• A request from a web client is passed on to an application server by the web server.
• Programmer productivity, cost-effective deployment, rapid time to market, seamless integration, application portability, scalability and security are some of the challenges that component technology tries to address head on.
• Enterprise JavaBeans (EJB) is Sun’s server component model. It provides portability across application servers and supports complex system features such as transactions, security, etc. on behalf of the application components.
• EJB is a specification provided by Sun, and many third-party vendors have products compliant with this specification: BEA Systems (bought out by Oracle), IONA, IBM, Oracle, Sybase (bought out by SAP).
Application Programming Model for Three-tier Applications
[Diagram labels: Presentation Components; Application Container; Business Logic; Container; Enterprise Components; Database Server]
Expectations of a Distributed System
• Access transparency: enables local and remote resources to be accessed using identical operations.
• Location transparency: enables resources to be accessed without knowledge of their location.
• Concurrency transparency: enables several processes to operate concurrently using shared resources without interference between them.
• Replication transparency: enables multiple instances of resources to be used to increase reliability and performance without knowledge of the replicas by users or application programmers.
• Failure transparency: enables the concealment of faults, allowing users and application programs to complete their tasks despite the failure of hardware or software components.
• Mobility transparency: allows the movement of resources and clients within a system without affecting the operation of users or programs.
• Performance transparency: allows the system to be reconfigured to improve performance as loads vary. (“Scalability”)
• Expansion transparency: allows the system and applications to expand in scale without change to the system structure or the application algorithms.
Issues (contd.)
• Heterogeneity of components
• Scalability: ability to perform well under increased loads and data sizes
• Failure handling
• Concurrency
• Transparency
• Reliability
• Interoperability
• Performance
• Openness
• Security and protection
Emerging Application Models
Large scale systems
• Most emerging distributed applications are very large, demanding large amounts of storage and data resources
• E-commerce systems and online businesses
• Applications connecting communities: from search to social networking
• See the world’s 10 largest databases
• Amount of data collected by various sources, from terrorism monitoring to environmental monitoring
• Data deluge: most of the data is write once read many (WORM)
• Analytics resulting in data-intensive computing models and big-data computing (CSE 487 material)
• We will look at just one success story of a distributed system: amazon.com
Amazon.com
• Werner Vogels’ talk “Order in the Chaos: Building the Amazon.com Platform.”
• 1995: Started out with a single web service on a single server.
• Today Amazon has about 150 web services on its homepage alone.
• 1 million merchant partners; 60 million customers
• One server for customers and inventory grew into two servers; more database servers were added as the business expanded
• 1999: A misstep during this exponential growth period was moving to a mainframe from distributed servers. It failed to meet scalability, reliability and performance needs and was scrapped in 2000.
Amazon (contd.)
• Robustness: the shopping cart is tested for 20000 items by a single customer, for example!
• Amazon’s secret sauce is “operating reliably at scale”.
• After “the denial of service” debacle in 1999, they decided to use Web services to insulate the databases from being overwhelmed by direct interaction with online applications.
• Each web service is the responsibility of a team of developers: “And they are not just responsible for writing the service and then tossing it over the wall for testing and eventual entry into production where some poor maintenance geek has to look after it. The Amazon CTO tells his Web services team members: "You build it. You own it." That means the team is responsible for its Web service's on-going operation. If a Web service stops working in the middle of the night, team members are called to fix it.”
• Web services are kept simple: complexity is the notorious enemy of reliability
• No attachment to one technology or standard: whatever the customer wants, give it to them.
On to more fundamental concepts
Synchrony
• Synchronous and asynchronous communications
• Synchronous: immediate response of communicating partners
  • Server process/thread blocks until response is completed
  • Follows request/response pattern
  • Used when servers are available all the time
  • Typically communicating partners are tightly coupled
• Examples:
  • Request from a web client (browser) to a web server for “search” or for “information”
  • CORBA procedure invocation
  • Java RMI (remote method invocation)
  • Traditional remote procedure call (RPC)
Asynchronous communication
• Communicating partners are decoupled
• Message driven: the sender creates a message and delivers it to a mediator, who then sends it to “a” recipient
• Server need not be available all the time
• Sender and receiver are loosely coupled
• Can facilitate high-performance message-based systems
• Examples:
  • Any event-driven system
  • Any messaging system (instant messenger)
  • Publish-subscribe mode communications
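The contrast between the two styles can be sketched with a plain function call for the synchronous case and a thread-safe queue acting as the mediator for the asynchronous case. The names `synchronous_call` and `asynchronous_demo` are made up for this illustration.

```python
import queue
import threading


def synchronous_call(handler, request):
    """Synchronous: the caller waits right here until the response comes back."""
    return handler(request)


def asynchronous_demo(handler, requests):
    """Asynchronous: the sender hands messages to a mediator and moves on."""
    mailbox = queue.Queue()  # the mediator between sender and receiver
    results = []

    def receiver():
        while True:
            message = mailbox.get()
            if message is None:  # sentinel: no more messages
                break
            results.append(handler(message))

    # The sender enqueues everything while the receiver is not even running yet.
    for request in requests:
        mailbox.put(request)
    mailbox.put(None)

    # The receiver comes online later and drains the mailbox.
    worker = threading.Thread(target=receiver)
    worker.start()
    worker.join()
    return results
```

Because the mailbox stores the messages, the receiver does not have to be available when they are sent, which is the decoupling described above.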
Interface vs Payload Semantics
• Typically, interaction between a client and a server results in the execution of an activity (or transaction)
• The requested activity needs to be specified by the request.
• Interface semantics: the requested activity can be encoded in the operation signature in the server’s “interface”, or
• Payload semantics: it can be embedded in the message itself
Interface Semantics
[Diagram: Process 1 and Process 2 exchange the calls getCustomer(), retrieveCustomerData(), returnResult()]
The semantics of the activity are explicitly stated in the message/method call
Payload Semantics
[Diagram: Process 1 sends an envelope with a message to Process 2]
• The requested transaction/activity is embedded in the message
• Details of the activity are not explicit; the semantics are embedded in the message
Payload Semantics
[Diagram: onMessage()]
Payload semantics is generic
String transferMoney(amt: decimal, accTo: String) { … }
String executeService(message: String) { … }
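The two signatures above can be contrasted in a short sketch. The JSON message layout (`operation`, `amt`, `accTo`) and the class `AccountService` are assumptions made for this example; the slide does not specify a message format.

```python
import json


class AccountService:
    def transfer_money(self, amount: str, account_to: str) -> str:
        """Interface semantics: the activity is named by the operation signature."""
        return f"transferred {amount} to {account_to}"

    def execute_service(self, message: str) -> str:
        """Payload semantics: one generic operation; the activity is inside the message."""
        request = json.loads(message)
        if request["operation"] == "transferMoney":
            return self.transfer_money(request["amt"], request["accTo"])
        return f"unknown operation: {request['operation']}"


service = AccountService()

# Same activity, requested in the two styles.
direct = service.transfer_money("25.00", "ACC-42")
generic = service.execute_service(
    json.dumps({"operation": "transferMoney", "amt": "25.00", "accTo": "ACC-42"})
)
```

Adding a new activity in the generic style changes only the message contents, not the interface, at the cost of moving error checking from the type system into the message handler.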
Document-centric Messages
• With the emergence of self-descriptive data structures such as XML, document-centric messaging has become popular
• Semantically rich messages in which the operation name, its parameters and return type are self-descriptive
• SOAP (Simple Object Access Protocol) over XML is an example
• Let’s look at the evolution of XML, SOAP and Web services (WS).
[Diagram labels: WS, SOA]
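A self-descriptive message might look like the XML below, parsed here with Python's standard `xml.etree.ElementTree`. The element and attribute names are invented for this sketch and the document is deliberately simplified; it is not a valid SOAP envelope.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: operation name, parameters and return type travel in the document.
MESSAGE = """\
<message>
  <operation name="transferMoney" returns="string">
    <param name="amt" type="decimal">25.00</param>
    <param name="accTo" type="string">ACC-42</param>
  </operation>
</message>
"""


def describe(xml_text: str):
    """Recover the operation name, return type and parameters from the message alone."""
    root = ET.fromstring(xml_text)
    operation = root.find("operation")
    params = {
        param.get("name"): (param.get("type"), param.text)
        for param in operation.findall("param")
    }
    return operation.get("name"), operation.get("returns"), params
```

The receiver needs no prior knowledge of a `transferMoney` signature: everything it needs to interpret the request is carried by the document itself.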
Tight vs. Loose Coupling
• An important characteristic of an SOA is that it is a loosely coupled system.
• On the technology front, this is driven by dynamic discovery and binding enabled by Universal Description, Discovery and Integration (UDDI)
• On the business front, loose coupling addresses the growing need for companies to be flexible and agile with respect to changes in their own processes and those of their partners
• How does loose coupling help in improving agility, flexibility and performance?
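Dynamic discovery and binding can be sketched with a small in-process registry. This is only loosely inspired by the idea behind UDDI and does not use or imitate the UDDI API; `ServiceRegistry`, `checkout` and the provider functions are hypothetical.

```python
class ServiceRegistry:
    """Toy registry: providers publish under a name, clients look the name up at call time."""

    def __init__(self):
        self._providers = {}

    def publish(self, name, provider):
        self._providers[name] = provider

    def discover(self, name):
        return self._providers[name]


def checkout(registry: ServiceRegistry, amount: int) -> str:
    """Client code: bound to the name "payment", not to a particular provider."""
    pay = registry.discover("payment")  # binding happens here, at call time
    return pay(amount)


def pay_by_card(amount: int) -> str:
    return f"card:{amount}"


def pay_by_invoice(amount: int) -> str:
    return f"invoice:{amount}"


registry = ServiceRegistry()
registry.publish("payment", pay_by_card)
first = checkout(registry, 10)

# A partner changes its process: a different provider is published under the same name.
registry.publish("payment", pay_by_invoice)
second = checkout(registry, 10)
```

The `checkout` function is unchanged between the two calls, which hints at one answer to the question above: the client can absorb a change on the provider side without being rebuilt.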
Tight vs. Loose coupling
Level | Tight coupling | Loose coupling
Physical coupling | Direct physical link required | Physical intermediary
Communication style | Synchronous | Asynchronous
Type system | Strongly typed (interface semantics) | Weak type system (payload semantics)
Interaction pattern | OO-style navigation of complex object trees | Data-centric, self-contained messages
Control of process logic | Central control of process logic | Distributed logic components
Service discovery and binding | Statically bound services | Dynamically bound services
Platform dependencies | Strong OS and programming language dependencies | OS- and programming language independent
Challenges
• Need transformative solutions such as the Internet and the Search
• Alignment with the needs of the business / user / non-computer specialists / community and society
• Need to address the scalability issue: large scale data, high performance computing, automation, response time, rapid prototyping, and rapid time to production
• Need to effectively address (i) the ever shortening cycle of obsolescence, (ii) heterogeneity and (iii) rapid changes in requirements
• Transform data from diverse sources into intelligence and deliver intelligence to the right people/users/systems
Tools to explore
• Design tool: Rational Rose demo model (Windows): http://www.cse.buffalo.edu/bina/rosecppdemo.exe
• Language for your distributed system development: Java
• IDE for development needs of your projects: NetBeans? Eclipse?
Summary
• We discussed the fundamental choices available to a designer in assembling a distributed system.
• A designer must choose an appropriate communication infrastructure, synchrony, call semantics, use of an intermediary, and object-oriented versus data-centric interfaces.
• In this course, we will study different types of distributed systems and learn to design, develop and implement distributed systems.