Distributed Systems 1 Introduction to Distributed Systems Simon

Distributed Systems 1. Introduction to Distributed Systems Simon Razniewski Faculty of Computer Science Free University of Bozen-Bolzano A. Y. 2014/2015

Outline 1. 2. 3. 4. Introduction History Architecture Paradigms

What is a Distributed System? • If you ask three computer scientists, you will get four different opinions

What is a Distributed System? • Examples?

Distributed System Definition: A collection of independent computers that appears to its users as a single coherent system • Several computers – Independence – Heterogeneity of architectures • Provides service • Transparency (users perceive a single system) – Collaboration! – Realizing collaboration mechanisms is the goal of distributed systems development – Interaction is hidden – Access to remote resources as if they were local

Alternative Definition • A distributed system is one where a machine I've never heard of can cause my program to fail Leslie Lamport • Abstraction: From the outside, a black box providing service interfaces

Requirements 1. 2. 3. 4. Make resources accessible Transparency Openness Scalability

Requirement 1 Make Resources Accessible • Resource: source or supply from which a benefit is produced – printers, computers, files, pages, networks, project deliverables, notes, … • • Economic implications Information exchange, knowledge sharing Distributed work, virtual organizations Issue: security – Balance between sharing and privacy – Difficult to be achieved in “open” networks

Requirement 2 Distribution Transparency Distributed system should present itself to users and applications as a single coherent system Transparency Description Access Hide differences in data representation and how a resource is accessed (multiple interacting OSs) Location Hide where a resource is located (naming strategies) Migration Hide that a resource may move to another location Relocation Migration while resource is in use Replication Hide that a resource is replicated (requires location transparency) Concurrency Hide that a resource may be shared by several competitive users Failure Hide the failure and recovery of a resource Persistence Hide whether a (software) resource is in memory or on disk

Requirement 2 Distribution Transparency Distributed system should present itself to users and applications as a single coherent system Sometimes transparency is misleading • Performance – Computation/interaction timings (e. g. Skype) – Consistency issues (e. g. , Drop. Box) • Location-awareness (embedded/ubiquitous systems) • Comprehensibility – Network awareness clarifies “strange behaviors” Properly tune the degree of transparency

Requirement 3 Openness Open DS: offers services according to standard rules describing their syntax and semantics • Interoperability: extent by which two systems/components can work together by just knowing each other’s interfaces • Portability: to what extent an application can be seamlessly executed in a different DS that implements the same interfaces • Reconfiguration/Extensibility: to what extent the DS can be re-organized or accommodate new-parts Services rely on interaction protocols

Standardization • Importance of service interfaces – Completeness: provide all relevant information – Neutrality: independent from implementations – Semantics? ? ? • Importance of interaction protocols and standards – Well-disciplined ways of exchanging messages • Standardization efforts: – – – ISO = International Standards Organisation ITU-T = International Telecommunication Union IETF = Internet Engineering Task force IEEE = Institute of Electrical and Electronic Engineers W 3 C = World Wide Web Consortium

Importance of Standards • Guarantee large-scale interoperation • Guarantee heterogeneity of products • Network standards: regulate interaction between remote hardware/software entities, connected through a network • Two types of standard – De jure (top-down): international committees and organizations – De facto (bottom-up): emerge from practice (UNIX, JAVA, TCP/IP, …)

Openness Guidelines • Guidelines – High componentization • Programming in the small vs programming in the large • Small replaceable/adaptable components with clear interfaces • Logical AND implementation separation – Components: must separate policy from mechanism • Hide mechanism • Offer policy

Requirement 4 Scalability Scalable DS: operates effectively and efficiently independently from the number of resources and users • Dimensions of scalability (Neumann, 1994) – Size: w. r. t. number resources/users – Geography: w. r. t. location of resources/users – Administration: w. r. t. involved administrative silos

Scalability Problems (Bottlenecks) Concept Example A single server for all users Centralized services (legacy systems, security reasons) A single database Centralized data (avoiding synchronization issues) Doing routing based on complete Centralized algorithms information Decentralized algorithms 1. No machine holds the complete system state 2. Machines take decisions only on local info 3. The algorithm is robust to machine failure

Scalability Problems (Geography) • From LANs to WANs… – LAN: short distances – WAN: large distances LAN WAN Synchronous communication (wait for answer) YES NO Broadcasting (send to all) YES Impossible MAYBE NO Centralized solution

Scalability Problems (Admin. ) • Combination of two administrative domains… • Non-technical issues – Politics, human relationships… • Conflicting policies – Payment, management, security • Lack of mutual trust – Internal code: extensively tested/used – External code: unknown • Security issues – Protection from malicious attacks coming from • The new users that have access to the system • The foreign code that is now part of the system

Scaling Techniques 1/2 • Asynchronous communication • Reduction of the overall communication

Scaling Techniques 2/2 • Distribution of (sub)components and data • Replication availability, load balancing – Caching (client’s responsibility) – Consistency and synchronization issues • Increase of computational resources – Temporary – Works only for the size-related aspects

False Myths • • The network is reliable The network is secure The network is homogeneous The topology does not change Latency is zero Bandwidth is infinite Transport cost is zero There is one administrator

Big Issues in Distributed Systems • Concurrency – Multiple autonomous processes running in parallel – Cooperation (resource sharing) – Competitiveness (shared access) • No global clock – Every interacting process has its own local time – No synchronization • Independent failures – Machine crash – Network problems • Security

How about our examples? 1. Make resources accessible 2. Transparency – – – – Access Location Migration Relocation Replication Concurrency Failure 3. Openness 4. Scalability

History

Computers 1945 -1985 • • Large-sized, expensive computers Centralized computing (single-processor systems) Monolithic programming OS: monolithic kernel

Computers around 1985 • Powerful micro-processors – Moore’s law: the number of transistors that can be placed over an integrated circuit doubles every 2 years • Computer (local-area) networks – Connection among several machines in a building – Small amount of data quickly transferable (μs) • Micro-kernels – Functionalities moved in user space – LAN functionalities distributed across the network!

Computers 1985 -now • High-speed LANs – 100 Mbps – 10 Gbps • Wide area networks – Connection at the earth level – 64 Kbps – 1 Gbps • Internet – Network of networks – Billions of users

Centralized vs. Distributed Computing

The Internet • A network of networks built upon TCP/IP • A basis for distributed systems (www, name services, Vo. IP, file sharing, …)

How Many People?

How Big is the Internet? “Eric Schmidt, the CEO of Google, the world’s largest index of the Internet, estimated the size at roughly 5 million terabytes of data. That’s over 5 billion gigabytes of data, or 5 trillion megabytes. Schmidt further noted that in its seven years of operations, Google has indexed roughly 200 terabytes of that, or. 004% of the total size. ” http: //www. wisegeek. com/how-big-is-the-internet. htm

Network Applications Finance and commerce e. Commerce e. g. Amazon and e. Bay, Pay. Pal, online banking and trading , Bitcoin The information society Web information and search engines, ebooks, Wikipedia; social networking: Facebook, Google+. Creative industries and online gaming, music and film at home, user-generated entertainment content, e. g. You. Tube, Flickr Healthcare health informatics, online patient records, monitoring patients Education e-learning, virtual learning environments; distance learning Transport and logistics GPS in route finding systems, map services: Google Maps, Google Earth Science Access to research papers and data sets, collaborative working (SVN, Google Docs, Overleaf), Mapreduce, SETI@home Environmental management sensor technology to monitor earthquakes, floods or tsunamis

Architecture

Distributed Systems vs Computer Networks • Distributed System: transparency/coherence – The middleware manages interaction (black box) – Applications often run on top of a computer network – E. g. WWW, multiplayer online games, … • Computer network: collection of autonomous computers interconnected by a single technology – Interconnection = ability of exchanging information • Usually via message passing – Users exposed to the actual machines (white box) – Applications support users in interconnecting machines – E. g. remote desktop, file transfer, remote printers, … • To realize distributed systems we need to understand/program computer networks! – Middleware development

Middleware Interconnection managed by middleware • Offers each application the same interface (paradigm) • Usually computers are distributed across a network

Network Types • Scale – PAN – Personal Area Network: one person – LAN – Local Area Network: private network, building/campus – MAN – Metropolitan Area Network: city – WAN – Wide Area Network: country/continent • Transmission technology – Broadcast links: single communication channel shared by all machines – Point-to-point links: connections between individual pairs of machines ( routing)

Network Scales

Paradigms for Distributed Systems • Paradigm: a recurrent pattern • Key features of DSs – Interprocess communication: A distributed application require the participation of two or more independent entities (processes). To do so, the processes must have the ability to exchange data among themselves. • Collaboration – Event synchronization: In a distributed application, the sending and receiving of data among the participants of a distributed application must be synchronized.

Interaction Protocol • Relies on message exchange needs message passing architecture – A process sends a message (request) – The message is delivered to a receiver – The receiver processes the request – The receiver sends a message back (response, reply) – The reply may trigger a further request • Message sequence charts usually employed to depict interaction

Message Passing: Features • Primitives – Basic • Send (a message) • Receive (wait for a message) – For connection-oriented approach (see later) • Connect • Disconnect • Abstraction similar to I/O: the primitives encapsulate the network communication • Example: Java Socket APIs

Rendezvous Problem

Rendezvous: Intuitive Formulation • Having never been to London before, two friends agree to meet at the entrance to Green Park. • Upon arrival, they both realize they've made an elementary mistake - the park has several entrances, and neither has a clue which one the other will be waiting at. • Each friend has the binary choice of either staying put in the hope that the other will come and find them, or heading off in search of the other. • If both end up waiting for the other they will never meet, while if both leave their starting positions in search for the othere is only a limited chance that their paths will coincide.

Solution: Client-Server • Server: waits passively for the arrival of requests. • Client: issues specific requests to the server and awaits response • Synchronous interaction model with blocking semantics for the client • These roles trivially solve (a-priori) the rendezvous problem: – Passive server: always waiting for new messages – Proactive client: raise a request to the server

Details: Client-Server • Assigns asymmetric roles to two interacting processes – Possibly running on different computers (not mandatory) – Client: requires a certain service to be accomplished (but has not the necessary capability) – Server: provides a certain service to (multiple) clients • Many C-S dynamic relationships! • E. g. : ATM – Customer is client of the machine – Machine is client of the bank information system – The bank information system becomes client of the customer’s other bank’s information system –…

Client-Server Pattern Client node request service <wait response> … … receive response … Server node request response (result) Communication Infrastructure <wait request> receive request <exec. service> … … return result Can be repeated to form a dialogue: sequence of request-response patterns

Session IPC examples The dialog in each session follows a pattern prescribed in the protocol specified for the service. Daytime service [RFC 867]: Client: Hello, <client address> here. May I have a timestamp please. Server: Here it is: (time stamp follows) World Wide Web session: Client: Hello, <client address> here. Server: Okay. I am a web server and speak protocol HTTP 1. 0. Client: Great, please get me the web page index. html at the root of your document tree. Server: Okay, here’s what’s in the page: (contents follows).

Locating the Service • A mechanism must be available to allow a client process to locate a server for a given service – “address” of the server process cabled a-priori in the code – logical name: to be mapped to the physical address of the server process • the mapping is usually performed at runtime service’s location to dynamic/moveable • See DNS next in the course – Registry service (SOA) • Will extensively talk about this later in the course….

Service-Oriented Architecture • See second part of the course

Client-Server: Features • • Asymmetric Synchronous Blocking One-to-many Client P Communication Infrastructure Server Client – One server P – Many clients – Each client engages in a separate dialogue with the server P • Dynamic binding – Client knows the server at invocation time – Server does not know clients a-priori • Communication infrastructure: message passing

Client-Server: Design • Client: usually light-weight • Server: huge computational load – Must provide the service and access resources (local DB) – Must handle concurrent interactions with multiple clients • Data integrity • Concurrent conflicting accesses • Users’ authentication, access management, privacy – Must be always ready to process new requests • Infinite process (daemon)

Protocol for C/S Interaction • A protocol is needed to specify the rules that must be observed by the client and the server during the conduction of a service – How the service is to be located – The sequence of inter-process communication • Who should speak first • Syntax and semantics of messages • Expected action when some message is received – The representation and interpretation of data exchanged with each IPC • On the Internet, such protocols are specified in the RFCs (by IETF) • Implementations: must adhere to the protocol

Client Strategies • Timeout – Expiration exception compensation • Compensation – Ask another server – Repeat request • How many times? Which timeout? • Wait vs give up vs polling • From synchronous to asynchronous forms of interaction • Note: impossible to distinguish – Different kinds of faults – Slow server could receive multiple versions of the same request!

Server Strategies • Must align to the clients’ strategy • In general: queue with clients’ requests – Sequentially served (iteration) • Client in timeout: delayed response absorbed by the communication infrastructure • Repeated (same) requests: must be recognized by the server – Only one service execution – Clients: must identify requests – Server: must maintain result until it is delivered to the client

Pull vs Push • Towards asynchronous models • Client starts the interaction: pull model – Clients decides whether to wait for the server or not • Push model: roles inverted for the response – Client makes a single, non-blocking request – Server executes the service and takes care of delivering the result to the client • Server becomes client of the client – Example: RSS feeds • Client register to the feed • The server takes care of delivering news to the client

MOM • Message-Oriented-Middleware • Message system: intermediary among separate, independent processes • Sender process: deposits a message with the message system – Once a message is sent, the sender is free to move on to other tasks • Message system: switch for messages – forwards it to a message queue/repository associated with the receiver (maintained by the MOM) • Receiver process: picks up the message from its queue when ready • Asynchronous and decoupled interaction!

Publish-Subscribe • MOM where messages carry topic/content • Subscribe: consumer process registers for receiving infos • Publish: producer process generate a new info – Notification server notifies subscribers multicast • Complete decoupling • Many-to-many

Connection(less) C/S • Connection-oriented interaction – Creation of a dedicated virtual communication channel before the message exchange • Order preserved! – E. g. : telephone calls – More reliable • Connectionless interaction – Isolated messages (no guarantees about ordering) – E. g. : mail system – More flexible • Choice: application + infrastructure constraints • Connections: requires resource allocation on both C/S – Static route: resource allocation on intermediate nodes – Dynamic routes

C/S: State Management • Who maintains the state of interaction? • Stateless interaction – Each message is self-contained and separate from the others – Stateless server: lightweight, fault-tolerant, reliable • Stateful interaction – Message depends from the previous interaction – Stateful server: efficiency (message size and execution

Stateful Server • E. g. sequential file read – get. Next. Line() Compact message • Server must create and track a session with the client – To be used for future interactions (recovery) – E. g. : at the “open” time • Server allocate resources for a possibly long timespan – Client could never close the session • High server complexity

Stateless Server • get. Line(line. Number) • Self-contained messages: carry the state as a parameter! – State maintained by the client + interaction – Simpler server • Only viable if operations are idempotent – The same operation always gives the same result – delete. Last. Line() vs. delete. Line(n)

Server Parallelism • Is the server able to execute multiple operations in parallel? • How are multiple clients’ requests served? • Sequential/iterative server: at most one request at a time – Queue of pending requests – Low resource exploitation (no overlapping between computation and I/O) • Concurrent server: serves multiple requests concurrently – A new request can be accepted when another request is being served – Better resource exploitation – Higher server complexity

Sequential Server requests from clients Communication Infrastructure service response to client Server process

Sequential Server requests from clients Tq Tc Communication Infrastructure Tr service Tc response to client • Perceived service time by the client: Tr = Ts + 2 Tc + Tq • Tq = Ts × average queue length • Limiting overhead: – Small queue – Drop requests when the queue is full Server process Ts

Concurrent Server (Multi-Process) Server requests from clients Communication Infrastructure Generate process Server process response to client service Chi. Ch Child process

Concurrent Server (Multi-Process) Server requests from clients Tq Tc Communication Infrastructure Tr Tc Generate process Tg Server process response to client service Ts Chi. Ch Child process Tr = Ts + 2 Tc + Tq + Tg • Saves time when – Intensive I/O – Multi-processor server

Recap: Server Types • Sequential server – Queue-related delays • Concurrent server – Small queue time – Overhead when using multiple (heavyweight) processes • Stateless server – No correlation with previous requests • Stateful server – Must remember previous requests • Connectionless server – Client requests are independent (out-of-order) • Connection-oriented server – Client requests are received in the same order they are issued

Thin Client CLIENT SERVER Presentation Logic Application Logic Database Logic

Fat Client CLIENT SERVER Presentation Logic Application Logic Database Logic

Multi-tier CLIENT SERVER 1 SERVER 2 Presentation Logic Application Logic Database Logic

Recap: Communication • Basic interaction scheme: asymmetric, synchronous, blocking • Asynchronous: no interest for the result • Non-blocking: no interest in waiting the result • Variations – Time out – Push-server

Peer-to-Peer • Architecture where computer resources and services are directly exchanged between computer systems – Resources and services include the exchange of information, processing cycles, cache storage, and disk storage for files. . • Computers that usually act as client communicate directly among themselves and can act as both clients and servers – Assuming whatever role is most efficient for the network • Participating processes – Equal roles – Equivalent capabilities – Equivalent responsibilities • Each participant may issue a request to another participant and receive a response

Peer-to-Peer • Applications – – instant messaging peer-to-peer file transfers video conferencing collaborative work

RPC • Remote Procedure Call: Call to a procedure stored in another (remote machine) – Technology-independence • Steps 1. Client: does a local procedure call to a client stub 2. Client stub • • packs the parameters into a message (marschalling) Sends the message to the server via a system call 3. Server machine: passes the incoming packets to the server stub 4. Server stub: calls the (local) server procedure and obtains a result 5. The reply traces the same steps in the reverse direction

The Distributed Objects Paradigms • The idea of applying object orientation to distributed applications is a natural extension of object-oriented software development • Applications access objects distributed over a network. • Objects provide methods, through the invocation of which an application obtains access to services. • Object-oriented paradigms include: – Remote method invocation (RMI) – Network services – Object request broker – Object spaces

RMI • Remote Method Invocation: object-oriented equivalent of RPC • A local object gets a reference to the remote object • It invokes operations on the remote object – Via a stub-skeleton approach – Proxy pattern – Abstraction • As with RPC, arguments may be passed with the invocation – Management of both values and object references

Take home • Definition of DS – Service provision with distributed black-box execution • Requirements – – Accessibility Transparency Openness Scalability • Examples of DS and level of satisfaction of the requirements • Client/Server architecture – synchronous versus asynchronous • Next topic: Threads