Chapter 2: Middleware
Gustavo Alonso
Computer Science Department
Swiss Federal Institute of Technology (ETHZ)
[email protected] ethz.ch
http://www.iks.inf.ethz.ch/

Contents - Chapter 2
o Understanding middleware
   ⇒ Middleware as a programming abstraction
   ⇒ Middleware as infrastructure
o A quick overview of conventional middleware platforms
   ⇒ RPC
   ⇒ TP Monitors
   ⇒ Object brokers
o Middleware convergence
©Gustavo Alonso, ETH Zürich.

Programming abstractions
o Programming languages and almost any form of software system always evolve towards higher levels of abstraction
   ⇒ hiding hardware and platform details
   ⇒ more powerful primitives and interfaces
   ⇒ leaving difficult tasks to intermediaries (compilers, optimizers, automatic load balancing, automatic data partitioning and allocation, etc.)
   ⇒ reducing the number of programming errors
   ⇒ reducing the development and maintenance cost of the applications developed by facilitating their portability
o Middleware is primarily a set of programming abstractions developed to facilitate the development of complex distributed systems
   ⇒ to understand a middleware platform one needs to understand its programming model
   ⇒ from the programming model the limitations, general performance, and applicability of a given type of middleware can be determined in a first approximation
   ⇒ the underlying programming model also determines how the platform will evolve and fare when new technologies appear

The genealogy of middleware
[Figure: layered stack of middleware technologies, listed here from top to bottom]
o Application servers
o TP-Monitors, Object brokers, Message brokers
o Transactional RPC, Object oriented RPC (RMI), Asynchronous RPC: specialized forms of RPC, typically with additional functionality or properties but almost always running on RPC platforms
o Remote Procedure Call: hides communication details behind a procedure call and helps bridge heterogeneous platforms
o sockets: operating system level interface to the underlying communication protocols
o TCP, UDP: User Datagram Protocol (UDP) transports data packets without guarantees; Transmission Control Protocol (TCP) verifies correct delivery of data streams
o Internet Protocol (IP): moves a packet of data from one node to another

And the Internet? And Java?
o Programming abstractions are a key part of middleware but not the only one:
   ⇒ a programming abstraction without good supporting infrastructure (i.e., a good implementation and support system underneath) does not help
o Programming abstractions, in fact, appear in many cases in reaction to changes in the underlying hardware or the nature of the systems being developed
o Java is a programming language that abstracts the underlying hardware: programmers see only the Java Virtual Machine regardless of what computer they use
   ⇒ code portability (not the same as code mobility)
   ⇒ the first step towards standardizing middleware abstractions (since now they can be based on a virtual platform everybody agrees upon)
o The Internet is a different type of network that requires one more specialization of existing abstractions:
   ⇒ The Simple Object Access Protocol (SOAP) of Web services is RPC wrapped in XML and mapped to HTTP for easy transport through the Internet
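
The idea of "RPC wrapped in XML" can be illustrated with a minimal Python sketch. It only builds and parses the XML envelope; sending it as the body of an HTTP request is left out. The application namespace `urn:example:inventory`, the helper names, and the `product_id` parameter are assumptions made for the example, not part of the slides.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:inventory"  # hypothetical application namespace


def build_soap_call(method, params):
    """Wrap a procedure call (name plus arguments) in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_soap_call(xml_text):
    """Recover the procedure name and arguments on the receiving side."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]                        # first element inside the Body is the call
    method = call.tag.split("}", 1)[1]    # strip the "{namespace}" prefix
    params = {child.tag: child.text for child in call}
    return method, params


# In a real deployment this string would travel as the body of an HTTP POST.
message = build_soap_call("Check_inventory", {"product_id": 42})
```

Note that the arguments come back as strings: XML carries text, so typing has to be restored by the receiving stub.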

Middleware as infrastructure
[Figure: DCE as an example of middleware infrastructure]
o DCE development environment: IDL sources, IDL compiler, interface headers
o client process: client code, language specific call interface, client stub, RPC run time service library
o server process: server code, language specific call interface, server stub, RPC run time service library
o RPC API, RPC protocols
o DCE runtime environment: security service, cell service, distributed file service, thread service

Infrastructure
o As the programming abstractions reach higher and higher levels, the underlying infrastructure implementing the abstractions must grow accordingly
   ⇒ Additional functionality is almost always implemented through additional software layers
   ⇒ The additional software layers increase the size and complexity of the infrastructure necessary to use the new abstractions
o The infrastructure is also intended to support additional functionality that makes development, maintenance, and monitoring easier and less costly
   ⇒ RPC => transactional RPC => logging, recovery, advanced transaction models, language primitives for transactional demarcation, transactional file system, etc.
   ⇒ The infrastructure is also there to take care of all the non-functional properties typically ignored by data models, programming models, and programming languages: performance, availability, recovery, instrumentation, maintenance, resource management, etc.

Understanding middleware
To understand middleware, one needs to understand its dual role as programming abstraction and as infrastructure

PROGRAMMING ABSTRACTION
o Intended to hide low level details of hardware, networks, and distribution
o Trend is towards increasingly more powerful primitives that, without changing the basic concept of RPC, have additional properties or allow more flexibility in the use of the concept
o Evolution and appearance to the programmer is dictated by the trends in programming languages (RPC and C, CORBA and C++, RMI and Java, Web services and SOAP-XML)

INFRASTRUCTURE
o Intended to provide a comprehensive platform for developing and running complex distributed systems
o Trend is towards service oriented architectures at a global scale and standardization of interfaces
o Another important trend is towards single vendor software stacks to minimize complexity and streamline interaction
o Evolution is towards integration of platforms and flexibility in the configuration (plus autonomic behavior)

Basic middleware: RPC
o One cannot expect the programmer to implement a complete infrastructure for every distributed application. Instead, one can use an RPC system (our first example of low level middleware)
o What does an RPC system do?
   ⇒ Hides distribution behind procedure calls
   ⇒ Provides an interface definition language (IDL) to describe the services
   ⇒ Generates all the additional code necessary to make a procedure call remote and to deal with all the communication aspects
   ⇒ Provides a binder in case it has a distributed name and directory service system
[Figure: path of a remote procedure call]
o Client process: CLIENT call to remote procedure; CLIENT stub procedure (Bind, Marshalling, Send); Communication module
o Server process: Communication module; Dispatcher (select stub); SERVER stub procedure (Unmarshalling, Return); SERVER remote procedure
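
The call path in the figure can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the "network" is a direct function call, JSON stands in for the marshalling format, and the stub is written by hand rather than generated from IDL. The procedure `lookup_product` and its catalogue are made up for the example.

```python
import json


# --- server side ---------------------------------------------------------
def lookup_product(product_id):
    """The remote procedure; the catalogue contents are hypothetical."""
    catalogue = {1: "bolt", 2: "nut"}
    return catalogue.get(product_id)


PROCEDURES = {"lookup_product": lookup_product}


def server_dispatch(request_bytes):
    """Dispatcher plus server stub: unmarshal, select the procedure, call it, marshal the result."""
    request = json.loads(request_bytes.decode("utf-8"))        # unmarshalling
    procedure = PROCEDURES[request["proc"]]                    # dispatcher selects the target
    result = procedure(*request["args"])
    return json.dumps({"result": result}).encode("utf-8")      # return


# --- client side ---------------------------------------------------------
class ClientStub:
    """Looks like a local object to the client but forwards every call."""

    def __init__(self, transport):
        self.transport = transport                             # bind

    def lookup_product(self, product_id):
        request = json.dumps(
            {"proc": "lookup_product", "args": [product_id]}
        ).encode("utf-8")                                      # marshalling
        reply = self.transport(request)                        # send
        return json.loads(reply.decode("utf-8"))["result"]


# The transport is an in-process call here; a real system would use the network.
stub = ClientStub(transport=server_dispatch)
name = stub.lookup_product(1)
```

The client code calls `stub.lookup_product(1)` exactly as it would call a local function, which is the point of the abstraction.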

What can go wrong here?
o RPC is a point to point protocol in the sense that it supports the interaction between two entities (the client and the server)
o When there are more entities interacting with each other (a client with two servers, a client with a server and the server with a database), RPC treats the calls as independent of each other. However, the calls are not independent
o Recovering from partial system failures is very complex. For instance, the order was placed but the inventory was not updated, or payment was made but the order was not recorded …
o Avoiding these problems using plain RPC systems is very cumbersome
[Figure: an inventory control client calling two servers]
INVENTORY CONTROL CLIENT
   Lookup_product
   Check_inventory
   IF supplies_low THEN
      Place_order
      Update_inventory
   ...
o Server 2 (products): New_product, Lookup_product, Delete_product, Update_product; DBMS with Products database
o Server 3 (inventory): Place_order, Cancel_order, Update_inventory, Check_inventory; DBMS with Inventory and order database
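
The partial failure described above is easy to reproduce. The following Python sketch simulates the inventory server in-process; the class, the `fail_next_update` switch, and the quantities are assumptions made for the illustration.

```python
class InventoryServer:
    """Stand-in for Server 3; each method models one independent RPC."""

    def __init__(self):
        self.orders = []
        self.stock = {"bolt": 3}
        self.fail_next_update = False   # hypothetical switch to simulate a failure

    def place_order(self, product, quantity):
        self.orders.append((product, quantity))

    def update_inventory(self, product, quantity):
        if self.fail_next_update:
            raise ConnectionError("server unreachable")
        self.stock[product] += quantity


def restock(server, product, quantity):
    # Two calls that belong together logically, but plain RPC
    # treats each one as independent of the other.
    server.place_order(product, quantity)
    server.update_inventory(product, quantity)


server = InventoryServer()
server.fail_next_update = True
try:
    restock(server, "bolt", 10)
except ConnectionError:
    pass  # the client sees an error but cannot tell how far the work got

# The order was recorded, yet the stock level never changed.
inconsistent = len(server.orders) == 1 and server.stock["bolt"] == 3
```

With plain RPC the client would have to detect this state and compensate by hand, for instance by calling `Cancel_order`, which can itself fail.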

Transactional RPC
o The solution to this limitation is to make RPC calls transactional, that is, instead of providing plain RPC, the system should provide TRPC
o What is TRPC?
   ⇒ same concept as RPC plus …
   ⇒ additional language constructs and run time support (additional services) to bundle several RPC calls into an atomic unit
   ⇒ usually, it also includes an interface to databases for making end-to-end transactions using the XA standard (implementing 2 Phase Commit)
   ⇒ and anything else the vendor may find useful (transactional callbacks, high level locking, etc.)
o Simplifying things quite a bit, one can say that, historically, TP-Monitors are RPC based systems with transactional support. We have already seen an example of this: Encina
[Figure: the Encina stack, top to bottom]
o Distributed Applications
o Encina Monitor
o Structured File Service, Peer to Peer Comm, Reliable Queuing Service
o Encina Toolkit
o OSF DCE
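
How several calls can be bundled into an atomic unit is sketched below with a heavily simplified two-phase commit in Python. It is not the XA interface: the `Participant` class, its `can_commit` flag, and the function names are assumptions for the example, and logging, timeouts, and coordinator recovery are all omitted.

```python
class Participant:
    """A resource manager that can prepare, commit, or roll back one unit of work."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit   # hypothetical flag: how this participant will vote
        self.pending = None
        self.committed = []

    def prepare(self, work):
        """Phase 1: tentatively accept the work and vote yes or no."""
        if not self.can_commit:
            return False
        self.pending = work
        return True

    def commit(self):
        """Phase 2, positive outcome: make the pending work permanent."""
        if self.pending is not None:
            self.committed.append(self.pending)
        self.pending = None

    def rollback(self):
        """Phase 2, negative outcome: discard the pending work."""
        self.pending = None


def two_phase_commit(work_items):
    """Coordinator. work_items is a list of (participant, work) pairs."""
    participants = [participant for participant, _ in work_items]
    votes = [participant.prepare(work) for participant, work in work_items]  # phase 1
    if all(votes):
        for participant in participants:                                     # phase 2
            participant.commit()
        return True
    for participant in participants:
        participant.rollback()
    return False


orders = Participant("orders")
inventory = Participant("inventory", can_commit=False)   # simulate a failing server
ok = two_phase_commit([(orders, "Place_order"), (inventory, "Update_inventory")])
```

Because the inventory participant votes no, the order is not committed either, which is exactly the guarantee that plain RPC cannot give.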

TP-Monitors
o The design cycle with a TP-Monitor is very similar to that of RPC:
   ⇒ define the services to implement and describe them in IDL
   ⇒ specify which services are transactional
   ⇒ use an IDL compiler to generate the client and server stubs
o Execution requires a bit more control since now interaction is no longer point to point:
   ⇒ transactional services maintain context information and call records in order to guarantee atomicity
   ⇒ stubs also need to support more information like transaction id and call context
o Complex call hierarchies are typically implemented with a TP-Monitor and not with plain RPC
[Figure: the inventory control client with transactional brackets]
INVENTORY CONTROL
   IF supplies_low THEN
      BOT
      Place_order
      Update_inventory
      EOT
o Server 2 (products): New_product, Lookup_product, Delete_product, Update_product; DBMS with Products database
o Server 3 (inventory): Place_order, Cancel_order, Update_inventory, Check_inventory; DBMS with Inventory and order database
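
The extra information carried by transactional stubs can be illustrated with a small Python sketch. It assumes that BOT and EOT mark the beginning and end of a transaction, and it models them as a context manager; the names `transaction`, `stub_call`, and `call_log` are hypothetical, and commit or abort processing is left out.

```python
import itertools
from contextlib import contextmanager

_tx_ids = itertools.count(1)   # source of fresh transaction ids
_current_tx = None             # id of the transaction in progress, if any
call_log = []                  # call records a transactional service would keep


@contextmanager
def transaction():
    """Bracket a group of calls, in the spirit of BOT ... EOT."""
    global _current_tx
    _current_tx = next(_tx_ids)        # BOT: open a new transaction context
    try:
        yield _current_tx
    finally:
        _current_tx = None             # EOT: close the context


def stub_call(service, *args):
    """Client stub: attach the current transaction id to every outgoing call."""
    record = {"tx": _current_tx, "service": service, "args": args}
    call_log.append(record)
    return record


with transaction() as tx:
    stub_call("Place_order", "bolt", 10)
    stub_call("Update_inventory", "bolt", 10)

stub_call("Lookup_product", "bolt")    # outside any transaction
```

Both calls inside the block carry the same transaction id, so a monitor could later commit or undo them together; the final call carries none.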

TP-Monitor Example
[Figure: a TP-Monitor deployment]
o Front end: user programs issuing requests such as "Yearly balance?" and "Monthly average revenue?"
o Interfaces to user defined services
o Programs implementing the services
o TP-Monitor environment: Control (load balancing, cc and rec., replication, distribution, scheduling, priorities, monitoring …); app server 1, app server 1', app server 2, app server 3; recoverable queue; wrappers
o Back ends: Branch 1, Branch 2, Finance Dept.

TP-Heavy vs. TP-Light = 3 tier vs. 2 tier
o A TP-heavy monitor provides:
   ⇒ a full development environment (programming tools, services, libraries, etc.),
   ⇒ additional services (persistent queues, communication tools, transactional services, priority scheduling, buffering),
   ⇒ support for authentication (of users and access rights to different services),
   ⇒ its own solutions for communication, replication, load balancing, storage management ... (similar to an operating system).
o Its main purpose is to provide an execution environment for resource managers (applications), with guaranteed reasonable performance
o This is the traditional monitor: CICS, Encina, Tuxedo.

o A TP-Light is a database extension:
   ⇒ it is implemented as threads, instead of processes,
   ⇒ it is based on stored procedures ("methods" stored in the database that perform a specific set of operations) and triggers,
   ⇒ it does not provide a development environment.
o Light Monitors are appearing as databases become more sophisticated and provide more services, such as integrating part of the functionality of a TP-Monitor within the database.
o Instead of writing a complex query, the query is implemented as a stored procedure. A client, instead of running the query, invokes the stored procedure.
o Stored procedure languages: Sybase's Transact-SQL, Oracle's PL/SQL.

Databases and the 2 tier approach
o Databases are traditionally used to manage data.
o However, simply managing data is not an end in itself. One manages data because one has some concrete application logic in mind. This is often forgotten when considering databases.
o But if the application logic is what matters, why not move the application logic into the database? This is what many vendors are advocating. By doing this, they propose a 2 tier model with the database providing the tools necessary to implement complex application logic.
o These tools include triggers, replication, stored procedures, queuing systems, standard access interfaces (ODBC, JDBC).
[Figure: the 2 tier model. Labels: client; database management system; Database developing environment; user defined application logic; database; external application; resource manager]
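
Moving application logic into the database can be illustrated with a trigger. The sketch below uses Python's built-in sqlite3 module as a stand-in for a full DBMS; the table names, the reorder threshold of 5, and the order size of 20 are assumptions chosen for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE inventory (product TEXT PRIMARY KEY, quantity INTEGER);
    CREATE TABLE orders (product TEXT, quantity INTEGER);

    -- Application logic lives in the database: whenever stock drops
    -- below the threshold, an order is placed automatically.
    CREATE TRIGGER reorder_when_low
    AFTER UPDATE OF quantity ON inventory
    WHEN NEW.quantity < 5
    BEGIN
        INSERT INTO orders (product, quantity) VALUES (NEW.product, 20);
    END;

    INSERT INTO inventory VALUES ('bolt', 8);
    """
)

# The client only updates the stock level; it never issues Place_order itself.
db.execute("UPDATE inventory SET quantity = quantity - 5 WHERE product = 'bolt'")
orders = db.execute("SELECT product, quantity FROM orders").fetchall()
```

The update and the order it triggers run inside the same database transaction, so the partial failure seen with plain RPC cannot occur here, at the price of tying the logic to one database.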

CORBA
o The Common Object Request Broker Architecture (CORBA) is part of the Object Management Architecture (OMA) standard, a reference architecture for component based systems
o The key parts of CORBA are:
   ⇒ Object Request Broker (ORB): in charge of the interaction between components
   ⇒ CORBA services: standard definitions of system services
   ⇒ A standardized IDL language for the publication of interfaces
   ⇒ Protocols for allowing ORBs to talk to each other
o CORBA was an attempt to modernize RPC by making it object oriented and providing a standard
[Figure: CORBA components]
o Client (CORBA object): client stub (proxy), CORBA library
o Server (CORBA object): interface to remote calls, server stub (skeleton), CORBA Basic Object Adapter
o Marshalling, serialization
o Object Request Broker (ORB)
o CORBA services

CORBA follows the RPC model
o CORBA follows the same model as RPC:
   ⇒ they are trying to solve the same problem
   ⇒ CORBA is often implemented on top of RPC
o Unlike RPC, however, CORBA proposes a complete architecture and identifies parts of the system in much more detail than RPC ever did (RPC is an inter-process communication mechanism, CORBA is a reference architecture that includes an inter-process communication mechanism)
o CORBA standardized component based architectures but many of the concepts behind it were already in place long ago
o Development is similar to RPC:
   ⇒ define the services provided by the server using IDL (define the server object)
   ⇒ compile the definition using an IDL compiler. This produces the client stub (proxy, server proxy, proxy object) and the server stub (skeleton). The method signatures (services that can be invoked) are stored in an interface repository
   ⇒ Program the client and link it with its stub
   ⇒ Program the server and link it with its stub
o Unlike in RPC, the stubs make client and server independent of the operating system and programming language

Objects everywhere: IIOP and GIOP
o In order for ORBs to be a truly universal component architecture, there has to be a way to allow ORBs to communicate with each other (one cannot have all components in the world under a single ORB)
o For this purpose, CORBA provides a General Inter-ORB Protocol (GIOP) that specifies how to forward calls from one ORB to another and get the responses back
o The Internet Inter-ORB Protocol (IIOP) specifies how GIOP messages are translated into TCP/IP
o There are additional protocols to allow ORBs to communicate with other systems
o The idea was sound but came too late and was soon superseded by Web services
[Figure: Client (CORBA object) on ORB 1 and Server (CORBA object) on ORB 2, connected through GIOP and IIOP over the Internet (TCP/IP)]

The best of two worlds: Object Monitors
Middleware technology should be interpreted as different stages of evolution of an “ideal” system. Current systems do not compete with each other per se, they complement each other. The competition arises as the underlying infrastructures converge towards a single platform:
o OBJECT REQUEST BROKERS (ORBs): Reuse and distribution of components via a standard, object oriented interface and a number of services that add semantics to the interaction between components.
o TRANSACTION PROCESSING MONITORS: An environment to develop components capable of interacting transactionally and the tools necessary to maintain transactional consistency
And Object Transaction Monitors?
Object Monitor = ORB + TP-Monitor

Conventional middleware today
o RPC and the model behind RPC are at the core of any middleware platform, even those using asynchronous interaction
o RPC, however, has become part of the low level infrastructure and it is rarely used directly by application developers
o TP-Monitors are still as important as they have been in the past decades but they have become components in larger systems and hidden behind additional layers intended for enterprise application integration and Web services. Like RPC, the functionality of TP-Monitors is starting to migrate to the low levels of the infrastructure and becoming invisible to the developer
o CORBA is being replaced by other platforms although its ideas are still being used and copied in new systems. CORBA suffered from three developments that changed the technology landscape: the quick adoption of Java and the Java Virtual Machine, the Internet and the emergence of the Web, and the rise of J2EE and related technologies to an almost de-facto standard for middleware

Middleware convergence
o In practice, one always needs more than one type of middleware. The question is what is offered by each product.
o Existing systems implement a great deal of overlapping functionality: what in CORBA are called the services
o Because of this overlapping functionality, there are many possible combinations. Some of them work, some don’t. In many cases the focus is on the overlapping functionality, not on the key aspects of a system
[Figure: an RPC system surrounded by its supporting functionality: runtime engine, App. wrappers, platform support, Name services, repository]

Interchangeable Functionality
[Figure: two alternative combinations of the same building blocks, each with runtime engine, App. wrappers, platform support, Name services, and repository. First combination: WF engine, RPC, TP monitor, RPC, CORBA. Second combination: WF engine, RPC, CORBA, RPC, TP-Monitor]
o That all these combinations are possible does not mean they all make sense
o In an integrated environment, this functionality should be incorporated not by plugging heavy, stand-alone components but by designing a coherent system from the beginning. This is not always feasible nowadays.

System design nowadays
[Figure: several overlapping copies of the same component set (RPC, runtime engine, App. wrappers, platform support, Name services, repository)]

“Ideal” System
[Figure: transaction, object management, process management, message management, and data management resting on a COMMON INFRASTRUCTURE]