Introduction Distributed Systems IT 332

2 Outline
• Definition of a distributed system
• Goals of distributed systems
• Types of distributed systems

3 Definition of a Distributed System
A distributed system is a collection of independent computers that appears to its users as a single coherent system.
Two aspects:
(1) Hardware: the computers are independent.
(2) Software: users think they are dealing with a single system.

4 Goals of Distributed Systems
• Making resources accessible. A problem that comes with sharing: security.
• Transparency: hide the fact that processes and resources are physically distributed across multiple computers.
• Openness: offer services according to standard rules that describe the syntax and semantics of those services.
• Scalability

5 Transparency
Types of transparency and what each one hides:
• Access: hide differences in data representation and how a resource is accessed
• Location: hide where a resource is located
• Migration: hide that a resource may move to another location
• Relocation: hide that a resource may be moved to another location while in use
• Replication: hide that a resource is replicated
• Concurrency: hide that a resource may be shared by several competing users
• Failure: hide the failure and recovery of a resource
Notes:
• Aiming at full distribution transparency is not always a good idea, and it is difficult to achieve.
• There is a trade-off between a high degree of transparency and the performance of the system.

6 Openness
An open distributed system is able to interact with services from other systems, irrespective of the underlying environment.
• Systems should conform to well-defined interfaces.
• Systems should support portability of applications.
• Systems should interoperate easily.
• Systems should be easily extensible.

7 Scalability
A distributed system is scalable if it remains effective when the number of resources and users increases significantly.
At least three aspects:
• Size scalability: number of users and/or processes
• Geographical scalability: maximum distance between nodes
• Administrative scalability: number of administrative domains
Most systems account only, to a certain extent, for size scalability, by using powerful servers. Today, the challenge lies in geographical and administrative scalability.

8 Scalability Techniques
• Distribution: partition data and computations across multiple machines. Examples: moving computations to clients (Java applets), decentralized naming services (DNS).
• Replication: make copies of data available at different machines. Examples: replicated file servers (mainly for fault tolerance), replicated databases, mirrored Web sites.
• Caching: allow client processes to access local copies. Examples: Web caches (browser/Web proxy), file caching (at server and client).
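The distribution technique above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hash-modulo placement rule, the `pick_node` helper, and the `PartitionedStore` class are hypothetical names invented for this example, and the "machines" are plain in-memory dictionaries rather than real servers.

```python
import hashlib


def pick_node(key: str, nodes: list[str]) -> str:
    """Map a key to exactly one node using a stable hash (illustrative rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


class PartitionedStore:
    """Toy key-value store whose data is split across several 'machines'."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        # One dictionary per node stands in for that machine's local storage.
        self.data = {node: {} for node in self.nodes}

    def put(self, key, value):
        self.data[pick_node(key, self.nodes)][key] = value

    def get(self, key):
        return self.data[pick_node(key, self.nodes)].get(key)


store = PartitionedStore(["node-a", "node-b", "node-c"])
for name in ["alice", "bob", "carol", "dave"]:
    store.put(name, name.upper())
```

Each key lives on one node only, so no single machine has to hold or serve the whole data set. A real system would also have to handle nodes joining and leaving, which this sketch ignores.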

9 Scaling Problem
• Having multiple copies (cached or replicated) leads to inconsistencies: modifying one copy makes that copy different from the rest.
• Keeping copies consistent in a general way requires global synchronization on each modification.
• Global synchronization makes large-scale solutions practically impossible.
• If we can tolerate inconsistencies, we may reduce the need for global synchronization.
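The inconsistency problem can be made concrete with a small sketch. The `ReplicatedValue` class and its method names are hypothetical and exist only for this illustration; the copies are entries in a list, not real replicas on separate machines.

```python
class ReplicatedValue:
    """Toy model of one data item stored as several copies."""

    def __init__(self, initial, copies=3):
        self.copies = [initial] * copies

    def write_local(self, index, value):
        # Update a single copy only: cheap, but the other copies become stale.
        self.copies[index] = value

    def write_synchronized(self, value):
        # Update every copy on each modification: consistent, but the cost
        # grows with the number of copies (the global synchronization
        # described above).
        for i in range(len(self.copies)):
            self.copies[i] = value

    def is_consistent(self):
        return all(copy == self.copies[0] for copy in self.copies)


item = ReplicatedValue(0)
item.write_local(1, 5)
stale = not item.is_consistent()      # the copies now disagree
item.write_synchronized(7)
repaired = item.is_consistent()       # all copies agree again
```

An application that can live with the stale state for a while can use the cheap local write and synchronize less often.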

10 Pitfalls When Developing Distributed Systems
False assumptions made by first-time developers:
• The network is reliable.
• The network is secure.
• The network is homogeneous.
• The topology does not change.
• Latency is zero.
• Bandwidth is infinite.
• Transport cost is zero.
• There is one administrator.

11 Types of Distributed Systems
• Distributed computing systems
• Distributed information systems
• Distributed pervasive systems

12 Distributed Computing Systems
Used for high-performance computing tasks.
Cluster computing systems:
• A collection of similar workstations or PCs connected by a high-speed local-area network (LAN).
• Homogeneous, given that each node runs the same OS.
• Used for parallel programming.
Grid computing systems:
• A collection of machines from different organizations, connected over a wide-area network to allow a group of people or institutions to collaborate in a virtual manner.
• Highly heterogeneous, given that each machine may be in a different administrative domain and may have different hardware, OS, and network technology.
• Support virtual organizations (VOs). A VO defines a group of users/applications that have access to a specified group of resources, which may be distributed across many different computers owned by many different organizations.

13 Cluster Computing An example of a cluster computing system.

14 Grid Computing Layered architecture for grid computing systems

15 Distributed Information Systems
Used to integrate networked applications in an organization.
Transaction processing systems:
• Support distributed transactions.
• A transaction contains operations such that either all of the operations are executed or none are executed.
• A transaction has the ACID properties (atomicity, consistency, isolation, durability).
• A distributed transaction is a transaction that accesses objects managed by multiple servers.
• Nested transactions are important in distributed systems: they provide a natural way of distributing a transaction across multiple machines. Subtransactions can run in parallel.
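The all-or-nothing property can be illustrated with a minimal single-machine sketch. It assumes an in-memory dictionary of account balances and a snapshot-and-restore rollback; the `transfer` function and `InsufficientFunds` exception are hypothetical names for this example. It shows atomicity only, not isolation, durability, or coordination across multiple servers.

```python
class InsufficientFunds(Exception):
    """Raised when a transfer would overdraw the source account."""


def transfer(accounts, source, target, amount):
    """Move money between accounts so that both updates happen or neither does."""
    snapshot = dict(accounts)          # remember the state before the transaction
    try:
        accounts[source] -= amount
        if accounts[source] < 0:
            raise InsufficientFunds(source)
        accounts[target] += amount
    except Exception:
        # Abort: restore the snapshot so no partial update remains visible.
        accounts.clear()
        accounts.update(snapshot)
        raise


accounts = {"alice": 100, "bob": 20}
transfer(accounts, "alice", "bob", 30)      # commits: both balances change

try:
    transfer(accounts, "alice", "bob", 500)  # aborts: balances stay as they were
except InsufficientFunds:
    aborted = True
```

After the failed second call the balances are exactly what the first call left behind, which is the behavior the slide describes as "either all of the operations are executed or none are executed".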

16 Transaction Processing Systems A nested transaction

17 Transaction Processing Systems A transaction processing (TP) monitor allows an application to access multiple servers/databases.

18 Enterprise Application Integration
• Enterprise application integration (EAI): let applications communicate directly with each other, not merely by means of the request/reply behavior supported by transaction processing systems.
• Types of communication middleware: remote procedure call (RPC), remote method invocation (RMI), and message-oriented middleware (MOM).

19 Enterprise Application Integration
• With a remote procedure call (RPC), an application component can send a request to another application component by doing a local procedure call. The result is sent back in the same manner.
• A remote method invocation (RMI) is essentially the same as an RPC, except that it operates on objects instead of applications.
• RPC and RMI have the disadvantage that the caller and callee are coupled: both need to be up and running at the time of communication, and they need to know how to refer to each other.
• This coupling led to message-oriented middleware (MOM): applications simply send messages to logical contact points, often described by a subject. Likewise, when applications are interested in a specific type of message, the MOM takes care that those messages are delivered to those applications.
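The decoupling that MOM provides can be sketched with a toy in-process broker. The `Broker` class, its method names, and the subject and subscriber strings are all hypothetical and chosen for this illustration; real message-oriented middleware adds persistence, networking, and delivery guarantees that are not modeled here.

```python
from collections import deque


class Broker:
    """Toy message broker: senders address a subject, not a receiver."""

    def __init__(self):
        # subject -> {subscriber name -> queue of pending messages}
        self.queues = {}

    def subscribe(self, subject, subscriber):
        self.queues.setdefault(subject, {}).setdefault(subscriber, deque())

    def publish(self, subject, message):
        # The sender does not know who (if anyone) is subscribed.
        for queue in self.queues.get(subject, {}).values():
            queue.append(message)

    def receive(self, subject, subscriber):
        # The receiver picks messages up later; it need not be running
        # at the moment the message was published.
        queue = self.queues.get(subject, {}).get(subscriber)
        return queue.popleft() if queue else None


broker = Broker()
broker.subscribe("orders", "billing")
broker.subscribe("orders", "shipping")
broker.publish("orders", "order-1")
broker.publish("orders", "order-2")
first_for_billing = broker.receive("orders", "billing")
```

Unlike an RPC, the publisher never names `billing` or `shipping`, and the messages wait in a queue until each subscriber asks for them.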

20 Enterprise Application Integration Middleware as a communication facilitator in enterprise application integration.

21 Distributed Pervasive Systems
• Part of our surroundings.
• Nodes are often small, battery-powered, mobile devices with only a wireless connection: laptops, smart phones, digital cameras, etc.
• Self-managing: not managed by a system administrator, and under no human administrative control.
• Examples: smart homes, electronic health care systems, sensor networks.

22 Health Care Systems
• New devices are being developed to monitor the well-being of individuals and to automatically contact physicians when needed.
• Personal health care systems are often equipped with various sensors organized in a (preferably wireless) body-area network (BAN).
• A BAN should be able to operate while a person is moving, with no strings (i.e., wires) attached to immobile devices.
• A health care system (HCS) is essentially a sensor network system.

23 Health Care Systems (HCS) Monitoring a person in a pervasive electronic HCS, using (a) a local hub or (b) a continuous wireless connection.

24 Next Chapter: Architecture
Questions?