Intelligent Information Systems 6 Software Composition Gio Wiederhold


Intelligent Information Systems
6. Software Composition
Gio Wiederhold
Research performed jointly with Dorothea Beringer
EPFL, April-June 2000, at 14:15 - 15:15, room INJ 218
9/25/2020 EPFL 6 S - Gio spring 2000

Schedule
Presentations in English -- but I'll try to manage discussions in French and/or German.
• I plan to cover the material in an integrating fashion, drawing from concepts in databases, artificial intelligence, software engineering, and business principles.
  1. 13/4 Historical background, enabling technology: ARPA, Internet, DB, OO, AI, IR.
  2. 27/4 Search engines and methods (recall, precision, overload, semantic problems).
  3. 4/5 Digital libraries, information resources. Value of services, copyright.
  4. 11/5 E-commerce. Client-servers. Portals. Payment mechanisms, dynamic pricing.
  5. 19/5 Mediated systems. Functions, interfaces, and standards. Intelligence in processing. Role of humans and automation, maintenance.
  6. 26/5 Software composition. Distribution of functions. Parallelism. [with D. Beringer]
  7. 31/5 Application to Bioinformatics.
  8. 15/6 Educational challenges. Expected changes in teaching and learning.
  9. 22/6 Privacy protection and security. Security mediation.
  10. 29/6 Summary and projection for the future.
• Feedback and comments are appreciated.

Software Composition
CHAIMS: Compiling High-level Access Interfaces for Multi-site Software
Gio Wiederhold and Dorothea Beringer
Stanford University
www-db.stanford.edu/CHAIMS

Introduction
Objective: Investigate revolutionary approaches to large-scale software composition.
Approach: Develop & validate a composition-only language.
Contributions and plans:
• Hardware and software platform independence.
• Asynchrony by splitting up the CALL statement.
• Performance optimization by invocation scheduling.
• Potential for multi-site dataflow optimization.
Apply technologies from the Distributed Database paradigm to Software Distribution.

Participants
• Support
  – DARPA ISO EDCS program (1996-1999)
  – Siemens Corporate Research (1996-1998)
  – DoD AFOSR AASERT student support (1997-1999)
  – Sloan Foundation - computer industry study (1996-97)
  – CIA fellowship (2000-2002)
• People
  – Gio Wiederhold (Prof. Res), PI
  – Marianne Siroker (Administration)
  – Dorothea Beringer (Postdoc, Res. Ass.; Ph.D. EPF Lausanne), Dec. 1997 - Dec. 1999
  – Neil Sample (CS Ph.D. student)
  – Laurence Melloul (CS MS, Ph.D. student)
  – Prasenjit Mitra (EE MS, now EE Ph.D. student, SKC project)
  – Graduated: Ron Burback (Ph.D.); Joshua Hui, Gaurav Bhatia, Kirti Kwatra, Prasanna Ramaswami, Pankaj Jain, Mehul Bastawala, Catherine Tornabene (MS CS); Wayne Lim (MS, I.E.), Connan King (E.E. BS), Woody Pollack (CS MS)
  – Louis Perrochon (postdoc, ETH Zurich), Fall quarter 1996

Observed Shift in Programming Tasks
[Figure: Integration versus Coding; time axis 1970, 1990, 2010?]

Hypotheses
• After the Y2K effort no large software applications will be written from the ground up. They will always be composed using existing legacy code.
• Composition requires functionalities not available in current mainstream programming languages.
• Large-scale distributed systems enable & require optimizations that differ from code optimizers.
• Composition programmers will use different tools from base programmers (type A versus type B [Belady]).

Languages & Interfaces
• Large languages intended to support coding as well as composition have not been adopted
  – Algol 68, PL/1, Ada, CLOS
  – in use: C, C++, Fortran, Java
• Databases are being successfully composed, using Client-server, Mediator architectures
  – distribution -- exploit network capabilities
  – heterogeneity -- autonomy creates heterogeneity
  – simple schemas -- some human interpretation
  – service model -- public and commercial sources

Typical Scenario: Logistics
A general has to ship troops and/or various material from San Diego NOSC to Washington DC:
  – different kinds of material: criteria for preferred transport differ
  – not every airport equally suited
  – congestion, prices
  – actual weather
  – certain due or ready dates
Today: calling different companies, looking up information on the web, reservations by hand.
Tomorrow: system proposes possibilities that take into account various conditions
  • hand-coded systems
  • composition of processes

Scaling alternatives?

CHAIMS
[Figure: layered overview, top to bottom]
• CHAIMS: client program for composition, written by domain specialist
• Compiler & templates: automate generation of client code for distributed system
• Transport protocols: access services
• Services & Interfaces: megamodules provided by various suppliers

Megamodules - Definition
Megamodules are large, autonomous, distributed, heterogeneous services or processes.
• large: computation intensive, data intensive, ongoing processes (example: monitoring services)
• distributed: to be used by more than one client
• heterogeneous: accessible by various distribution protocols (not only different languages and systems)
• autonomous: maintenance and control over resources remains with provider, differing ontologies (==> SKC project)
Examples:
  – logistics: "find best transportation route from A to B", reservation systems
  – genomics: easier framework for composing various processing tools than ad-hoc coding

Mega-programming Process
[Figure: the megaprogrammer writes the megaprogram text; the CHAIMS compiler takes it together with module / platform descriptions and returns feedback; the customer runs the megaprogram, which reaches an IO module and, over transport mechanisms, the modules to be composed: native modules and legacy modules behind a wrapper / API.]

Integrated systems: Fat Clients
[Figure: the domain expert works at a client computer that holds control & computation services (a, b, c, d, e) and I/O; wrappers to resolve differences sit between the client and the data resources.]

Desired: Modest Clients
[Figure: the domain expert works at a client workstation (C) with an IO module; computation services run as MEGA modules (a, b, c, d, e) at sites R, S, T, U, close to the data resources.]

Valuable Services
They are not free for a client:
  – execution time of a service
  – transfer time for data
  – fees for services
What the client applications need:
• monitoring progress of a service
• possibility to choose among equivalent services based on estimated waiting time and fees
• high performance due to parallelism among services
• preliminary overview results, choosing level of accuracy / number of results for complex processes
• effective optimization techniques
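Choosing among equivalent services by estimated waiting time and fees could look roughly like the sketch below. It is only an illustration: the Estimate record, the fee_weight parameter, and the sample numbers are assumptions, not part of CHAIMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """Hypothetical estimate a service reports before it is invoked."""
    service: str
    wait_seconds: float
    fee: float


def choose_service(estimates, fee_weight=1.0):
    """Pick the estimate with the lowest combined cost: wait + fee_weight * fee."""
    if not estimates:
        raise ValueError("no candidate services")
    return min(estimates, key=lambda e: e.wait_seconds + fee_weight * e.fee)


# Two services offering equivalent functions (illustrative numbers only).
candidates = [
    Estimate("Routing", wait_seconds=40.0, fee=2.0),
    Estimate("RouteOptimizer", wait_seconds=25.0, fee=10.0),
]
fast = choose_service(candidates, fee_weight=1.0)   # waiting time dominates
cheap = choose_service(candidates, fee_weight=5.0)  # fees dominate
```

A single weighted sum is the simplest possible cost function; a client could equally rank on time alone and use fees only as a cut-off.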

Empower Non-technical Domain Experts
Company providing services:
  – domain experts of domain of service (e.g. weather)
  – technical experts for programming for distribution protocols, setting up servers in a middleware system
  – marketing experts
"Composer of Megaprograms":
  – is domain expert of domain that uses these services
  – is not technical expert of middleware system or experienced programmer
  – wants to focus on problem at hand (= results of using megaprogram)
  e.g., scientist, logistics officer, ...

A purely compositional language?
Which languages did succeed?
  – Algol, Ada: integrated composition and computation
  – C, C++: focus on computation
Why a new language?
  – complexity: not all facilities of a common language (compare to approach of Java)
  – inhibiting traditional computational programming (compare C++ and Smalltalk concerning object-oriented programming)
  – focus on issue of composition, parallelism by natural asynchrony, and novel optimizations

CHAIMS Physical Architecture
[Figure: megaprogram clients in CHAIMS issue complex calls over the network (CORBA, Java RMI, DCE, DCOM ...) to megamodules (wrapped, native).]
Each megamodule supports setup, estimate, invoke, examine, extract, and terminate.

Decomposing CALL statements
CALL gained functionality (progress in scale of computing):
• Copying
• Code sharing
• Parameterized computation
• Objects with overloaded method names
• Remote procedure calls to distributed modules
• Constrained (black box) access to encapsulated data
CHAIMS decomposes CALL functions: Setup, Estimate, Invoke, Examine, Extract

CHAIMS Megaprogr. Language
Purely compositional:
  – only variety of CALLs and control flow
  – no primitives for input/output ==> instead use general and problem-specific I/O megamodules
  – no primitives for arithmetic ==> use math megamodules
Splitting up CALL-statement:
  – parallelism by asynchrony in sequential program
  – novel possibilities for optimizations
  – reduction of complexity of integrated invoke statements
• Very high-level language, just as: assembler ==> HLLs, HLLs ==> composition/megamodule paradigm

CHAIMS Primitives
Pre-invocation:
  SETUP: set up the connection to a megamodule
  SET-, GETATTRIBUTES: set global parameters in a megamodule
  ESTIMATE: get estimate of execution time for optimization
Invocation and result gathering:
  INVOKE: start a specific method
  EXAMINE: test status of an invoked method
  EXTRACT: extract results from an invoked method
Termination:
  TERMINATE: terminate a method invocation or a connection to a megamodule
Control:
  WHILE, IF
Utility:
  GETPARAM: get default parameters

Megaprogram Example
General I/O-megamodule:
  InputOutput
    - Input
    - Output
  • Input function takes as parameter a default data structure containing names, types and default values for expected input
Travel information:
  RouteInfo
    - AllRoutes
    - CityPairList
    - ...
  • Computing all possible routes between two cities
  AirGround
    - CostForGround
    - CostForAir
    - ...
  • Computing the air and ground cost for each leg given a list of city-pairs and data about the goods to be transported
Two megamodules that offer equivalent functions for calculating optimal routes:
  Routing
    - BestRoute
    - ...
  RouteOptimizer
    - Optimum
    - ...
  • Optimum and BestRoute both calculate the optimum route given routes and costs
  • Global variables: optimization can be done for cost or for time

Megaprogram Example: Code

    // Setup connections to megamodules.
    io_mmh = SETUP ("InputOutput")
    route_mmh = SETUP ("RouteInfo")
    ...

    // Set global variables valid for all invocations of this client.
    best2_mmh.SETATTRIBUTES (criterion = "cost")

    // Get information from the megaprogram user about the goods
    // to be transported and about the two desired cities.
    cities_default = route_mmh.GETPARAM(Pair_of_Cities)
    input_cities_ih = io_mmh.INVOKE ("input", cities_default)
    WHILE (input_cities_ih.EXAMINE() != DONE) {}
    cities = input_cities_ih.EXTRACT()

    // Get all routes between the two cities.
    route_ih = route_mmh.INVOKE ("AllRoutes", Pair_of_Cities = cities)
    WHILE (route_ih.EXAMINE() != DONE) {}
    routes = route_ih.EXTRACT()

    // Get all city pairs in these routes.
    // Calculate the costs of all the routes.
    …

    // Figure out the optimal megamodule for picking the best route.
    // Pick the best route and display the result.
    IF (best1_mmh.ESTIMATE("Best_Route") < best2_mmh.ESTIMATE("Optimum"))
    THEN {best_ih = best1_mmh.INVOKE ("Best_Route", Goods = info_goods,
              Pair_of_Cities = cities, List_of_Routes = routes,
              Cost_Ground = cost_list_ground, Cost_Air = cost_list_air)}
    ELSE {best_ih = best2_mmh.INVOKE ("Optimum", Goods = info_goods, …
    ...

    // Terminate all invocations.
    best2_mmh.TERMINATE()

Operation of one Megamodule
(M handle = megamodule handle, I handle = invocation handle)
• SETUP (yields M handle)
• SETATTRIBUTES provides context (M handle)
• ESTIMATE serves scheduling (M handle)
• INVOKE initiates remote computation (M handle, yields I handle)
• EXAMINE checks for completion (I handle)
• EXTRACT obtains results (I handle)
• TERMINATE I / ALL (I handle / M handle)

Creation Process
[Figure]
• Megamodule Provider provides native or wraps legacy megamodules (Wrapper Templates)
• Megamodule Provider adds information to CHAIMS Repository
• Also shown: Repository Browser; MEGA modules a, b, c, d, e

Composition Process
[Figure]
• Composer reads information from CHAIMS Repository (Repository Browser)
• Composer writes Megaprogram (Composition Wizard)
• Composer starts CHAIMS Compiler
• CHAIMS Compiler generates CSRT (compiled megaprogram)

Runtime Architecture
[Figure: the end-user works with the client, which runs the CSRT (compiled megaprogram); the CSRT reaches the IO module(s) and the MEGA modules (a, b, c, d, e) through the distribution system (CORBA, RMI ...).]

Architecture: Overview
[Figure combining the creation, composition, and runtime views]
• Megamodule Provider adds information to CHAIMS Repository (Wrapper Templates)
• Composer reads information from the repository and writes the Megaprogram
• CHAIMS Compiler generates CSRT
• End-user runs the Client (CSRT), which reaches IO module(s) and MEGA modules (a, b, c, d, e) through the Distribution System

Multiple Transport Protocols
The CHAIMS API defines the interface between the composer and the client megaprogram; the megaprogram is written in the CHAIMS language.
The CHAIMS protocols define the calls the megamodules have to understand. These protocols are slightly different for the different distribution protocols, and are defined by an IDL for CORBA, another IDL for DCE, and a Java class for RMI.
[Figure: Composer - CHAIMS language - Megaprogram - CHAIMS protocols (CORBA-idl, DCE-idl, Java-class) - Megamodules]

Data objects: Blobs
Minimal typing within CHAIMS:
• Integer, boolean only for control
• All else is placed into Binary Large OBjects (Blobs), transparent to compiler
Alternatives:
• ASN.1, with conversion routines (now)
• XML, with interpretation (next)
Example: Person_Information (complex)
  Name of Person (complex)
    First Name (string): Joe
    Last Name (string): Smith
  Personal Data (complex)
    Date of Birth (date): 6/21/54
    Soc.Sec.No (string): 345-34-345
    Address
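A minimal sketch of the blob idea, assuming XML as the encoding (the slide lists it as the planned alternative to ASN.1): the megaprogram only passes the bytes along, and only a megamodule or its wrapper decodes them. The helper names encode_blob and decode_blob and the underscore-separated field names are hypothetical.

```python
import xml.etree.ElementTree as ET


def encode_blob(tag, value):
    """Encode a nested dict of strings into opaque bytes (XML as a stand-in)."""
    def build(tag, value):
        elem = ET.Element(tag)
        if isinstance(value, dict):
            for child_tag, child_value in value.items():
                elem.append(build(child_tag, child_value))
        else:
            elem.text = str(value)
        return elem

    return ET.tostring(build(tag, value))   # bytes by default


def decode_blob(blob):
    """Decode bytes produced by encode_blob into (root tag, nested dict)."""
    def unpack(elem):
        children = list(elem)
        if children:
            return {child.tag: unpack(child) for child in children}
        return elem.text or ""

    root = ET.fromstring(blob)
    return root.tag, unpack(root)


person = {
    "Name_of_Person": {"First_Name": "Joe", "Last_Name": "Smith"},
    "Personal_Data": {"Date_of_Birth": "6/21/54", "Soc_Sec_No": "345-34-345"},
}
blob = encode_blob("Person_Information", person)  # opaque to the megaprogram
tag, decoded = decode_blob(blob)                  # done by a megamodule or wrapper
```

All leaf values are kept as strings here; a typed encoding such as ASN.1 would carry the date and string types along with the values.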

Wrapper: CHAIMS Compliance
CHAIMS protocol - support all CHAIMS primitives
  – if not native, achieved by wrapping legacy codes
• State management and asynchrony:
  • clientId (megamodule handle in CHAIMS language)
  • callId (invocation handle in CHAIMS language)
  • results must be stored for possible extraction(s) until termination of the invocation
• Data transformation:
  • all parameters of type blob (BER-encoded Gentype) must be converted into the megamodule-specific data types (coding/decoding routines)
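The two wrapper duties, state management and data transformation, can be sketched around a synchronous legacy routine as below. This is an assumption-laden illustration: LegacyWrapper and legacy_cost are hypothetical names, and JSON stands in for the BER encoding named on the slide.

```python
import itertools
import json
from concurrent.futures import ThreadPoolExecutor


class LegacyWrapper:
    """Hypothetical wrapper giving a synchronous routine CHAIMS-style calls."""

    def __init__(self, legacy_function):
        self._legacy = legacy_function
        self._pool = ThreadPoolExecutor(max_workers=4)
        self._ids = itertools.count(1)
        self._calls = {}                       # (client_id, call_id) -> Future

    def setup(self):
        return next(self._ids)                 # clientId

    def invoke(self, client_id, blob):
        params = json.loads(blob.decode("utf-8"))    # blob -> native types
        call_id = next(self._ids)                    # callId
        self._calls[(client_id, call_id)] = self._pool.submit(self._legacy, **params)
        return call_id

    def examine(self, client_id, call_id):
        return "DONE" if self._calls[(client_id, call_id)].done() else "NOT_DONE"

    def extract(self, client_id, call_id):
        # The Future keeps the result, so repeated extractions are possible.
        result = self._calls[(client_id, call_id)].result()
        return json.dumps(result).encode("utf-8")    # native types -> blob

    def terminate(self, client_id, call_id=None):
        # Drop one invocation, or all invocations of this client.
        keys = [k for k in self._calls
                if k[0] == client_id and call_id in (None, k[1])]
        for key in keys:
            self._calls.pop(key).cancel()


def legacy_cost(distance_km, rate):            # stand-in for legacy code
    return distance_km * rate


wrapper = LegacyWrapper(legacy_cost)
client_id = wrapper.setup()
request = json.dumps({"distance_km": 100, "rate": 2}).encode("utf-8")
call_id = wrapper.invoke(client_id, request)
first = wrapper.extract(client_id, call_id)
second = wrapper.extract(client_id, call_id)   # still stored before termination
wrapper.terminate(client_id, call_id)
```

Here extract blocks if the invocation has not finished; a client that wants to keep working would call examine first.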

Architecture: Three Views
• Composition View (megaprogram), CHAIMS Layer
  - composition of megamodules
  - directing of opaque data blobs
• Data View
  - exchange of data
  - interpretation of data
  - in/between megamodules
• Transport View, Distribution Layer
  - moving around data blobs and CHAIMS messages
Objective: Clear separation between composition of services, computation over data, and transport

Scheduler: Decomposed Execution
[Figure: timelines comparing synchronous, asynchronous, and decomposed execution (no benefit for one module); bars mark execution of a remote method and time available for other methods.]
Legend: s = setup / set attributes, i = invoke a method, e = extract results

Optimized Execution of Modules
[Figure: two timelines for modules M1 to M5, non-optimized and optimized by scheduler according to estimates; annotations (>M1+M2) and (<M1+M2) on M3 and M4; data dependencies and execution of a module are marked.]
Legend: i = invoke a method, e = extract results

Decomposed Parallel Execution
[Figure: timeline for modules M1 to M5, optimized by scheduler according to estimates; annotations (<M1+M2) on M3 and M4.]
Long setup times occur, for instance, when a subset of a large database has to be loaded for a simple search, say Transatlantic flights for an optimal arrival.
Legend: set up / set attributes, invoke a method, extract results

Scheduling: Simple Example
Each step is marked [order in unscheduled megaprogram / order in automatically prescheduled megaprogram].

    [1 / 1]  cost_ground_ih = cost_mmh.INVOKE ("Cost_for_Ground",
                 List_of_City_Pairs = city_pairs, Goods = info_goods)
    [2 / 3]  WHILE (cost_ground_ih.EXAMINE() != DONE) {}
             cost_list_ground = cost_ground_ih.EXTRACT()
    [3 / 2]  cost_air_ih = cost_mmh.INVOKE ("Cost_for_Air",
                 List_of_City_Pairs = city_pairs, Goods = info_goods)
    [4 / 4]  WHILE (cost_air_ih.EXAMINE() != DONE) {}
             cost_list_air = cost_air_ih.EXTRACT()
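The effect of the reordering can be tried out with a small sketch in which thread-pool futures play the role of invocation handles. The two cost functions, their 0.2 s delays, and the returned numbers are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cost_for_ground(city_pairs):
    time.sleep(0.2)                         # simulated remote computation
    return {pair: 10 for pair in city_pairs}


def cost_for_air(city_pairs):
    time.sleep(0.2)                         # simulated remote computation
    return {pair: 30 for pair in city_pairs}


city_pairs = [("San Diego", "Washington DC")]

with ThreadPoolExecutor(max_workers=2) as pool:
    # Unscheduled order: invoke, wait and extract, then invoke the next method.
    start = time.perf_counter()
    ground_a = pool.submit(cost_for_ground, city_pairs).result()
    air_a = pool.submit(cost_for_air, city_pairs).result()
    unscheduled = time.perf_counter() - start

    # Prescheduled order: both invocations are issued before any extraction.
    start = time.perf_counter()
    ground_ih = pool.submit(cost_for_ground, city_pairs)
    air_ih = pool.submit(cost_for_air, city_pairs)
    ground_b, air_b = ground_ih.result(), air_ih.result()
    prescheduled = time.perf_counter() - start
```

Both orders produce the same results; the prescheduled one overlaps the two waits, so its elapsed time is close to that of the slower invocation rather than the sum of both.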

Decomposed Optimized Execution
[Figure: timelines for modules M1 to M5, prior time versus time when optimized by scheduler according to estimates; annotations (>M1+M2) and (<M1+M2) on M3 and M4.]
Legend: set up / set attributes, invoke a method, extract results

Iterated Invocations
Avoid repeated setups
[Figure: timelines for invocations M6.1 to M6.5, prior time versus time with repeated setups avoided.]
Legend: set up / set attributes, invoke a method, extract results

Repeated Extractions
Avoid large extracts until satisfied
[Figure: timelines for invocations M6.1 to M6.5: prior time with distinct invocations; time with shared setup; time with shared setup & partial extract.]
Legend: set up / set attributes, invoke a method, extract results (partial for iterating, full for presentation)

Scheduling: Heuristics
INVOKE: call INVOKEs as soon as possible
• may depend on other data
• moving it outside of an if-block: depends on the cost function (ESTIMATE of this and following functions concerning execution time, dataflow and fees (resources))
EXTRACT: move EXTRACTs to where the result is actually needed
• no sense in checking/waiting for results before they are needed
• instead of waiting, poll all invocations and issue the next possible invocation as soon as data can be extracted
TERMINATE: terminate invocations that are no longer needed (save resources)
• not every method invocation has an extract (e.g. print-like functions)
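The first two heuristics (invoke as soon as inputs exist, poll instead of waiting on one invocation) can be sketched as a small dataflow loop. The function run_dataflow and the task table are hypothetical; thread-pool futures again stand in for invocation handles, and ESTIMATE-based cost decisions are not modelled.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_dataflow(tasks):
    """tasks maps name -> (function, names of the results it needs)."""
    results, running, order = {}, {}, []
    with ThreadPoolExecutor() as pool:
        while len(results) < len(tasks):
            # INVOKE every task whose inputs have all been extracted.
            for name, (func, deps) in tasks.items():
                ready = all(d in results for d in deps)
                if name not in results and name not in running and ready:
                    running[name] = pool.submit(func, *[results[d] for d in deps])
                    order.append(name)
            if not running:
                raise ValueError("cyclic or unsatisfiable dependencies")
            # Poll: block only until any one invocation finishes, then EXTRACT.
            done, _ = wait(running.values(), return_when=FIRST_COMPLETED)
            for name in [n for n, f in running.items() if f in done]:
                results[name] = running.pop(name).result()
    return results, order


tasks = {
    "routes": (lambda: ["r1", "r2"], []),
    "ground": (lambda routes: {r: 10 for r in routes}, ["routes"]),
    "air": (lambda routes: {r: 30 for r in routes}, ["routes"]),
    "best": (lambda ground, air: min(min(ground.values()), min(air.values())),
             ["ground", "air"]),
}
results, order = run_dataflow(tasks)
```

In this run the ground and air cost computations are both issued as soon as the routes are available, and the final step is issued only once both results have been extracted.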

Compiling into a Network?
[Figure: two diagrams of a Mega Program and Modules A to F, showing control flow and data flow: the current CHAIMS system, and the same system with distribution dataflow optimization.]

CHAIMS Implementation
• Specify minimal language
  – minimal functions: CALLs, While, If *
  – minimal typing {boolean, integer, string, handles, object}
    • objects encapsulated using ASN.1 standard
  – type conversion in wrappers, service modules *
• Compiler for multiple protocols (one-at-time, mixed 2, all *)
• Wrapper generation for multiple protocols
• Native modules for I/O, simple mathematics, other
• Implement API for CORBA, Java RMI, DCE, DCOM * usage
• Wrap / construct several programs for simple demos
• Schedule optimization *
• Demonstrate use in heterogeneous setting
• Full-scale demonstration
* in process

Research Questions
• Is a Megaprogramming language focusing only on composition feasible?
• Can it exploit on-going progress in client-server models and be protocol independent?
• Can natural parallelism for distributed services be effectively scheduled?
• Can high-level dataflow among distributed modules be optimized?
• Can CHAIMS express clearly a high-level distributed SW architecture?
• Can the approach affect SW process concepts and practice?

Questions not addressed
• Will one Client/Server protocol subsume all others?
  – distributed optimization remains an issue
• Synchronization / Concurrency Control
  – autonomy of sources negates current concepts
  – if modules share databases, then database locks may span setup/terminate all for a megaprogram handle
• Will software vendors consider moving to a service paradigm?
  – need CHAIMS demonstration for evaluation

CHAIMS proves that...
• We can do composition in a high-level language.
  • same language for Java-RMI-invocations and CORBA-invocations (and DCE, DCOM, TCP/IP protocols)
  • (single megaprogram can deal with multiple protocols simultaneously)
  • multiple megamodules can run in parallel
• Large-scale composition can be automated.
  • in contrast to manual non-software composition (e.g. telephone, cut&paste)
  • in contrast to fixed programs for one specific problem (e.g. transporting military goods within US)
• We can schedule programs in a way that right now only smart logistics officers can, avoiding unnecessary waits.
• Scheduling of invocations can be optimized.

Backup slides

Status
• Definition of architecture for Megaprogramming
  – bottom up assessment of code to be generated
  – examples: room reservation, shipping
  – primitives
  – handles for parallel operation
  – heterogeneity -- common features of distribution protocols
• Minimal language that can generate the code
  – no versus very few types -- ASN.1 for complex types
  – natural parallelism -- still a major research issue
• Awareness of novel optimizations
  – information flow constraints -- scheduling
  – direct data flow between megamodules

Focus for Future
• Augment CHAIMS compiler to generate multiple feasible and effective paths for execution.
• Create CHAIMS interpreter to complement compiler and execute scheduling decisions.
• Dynamic scheduling of invocations and extractions.
• Flexible interaction with megamodules; extracting and handling overview results.
• Direct dataflows between megamodules (planned).

Composition of Processes...
• versus composition and integration of Data
  • data-warehouses
  • wrapping data available on web
• versus composition of Components
  • reusing small components via copy/paste or shared libraries locally installed
  • large distributed components within same "domain" as composition, e.g. within one bank or airline
CHAIMS:
  » processed information
  » composing autonomous execution threads

CHAIMS "Logical" Architecture
[Figure: layers, top to bottom]
• Customer
• Megaprogram clients (in CHAIMS)
• Network/Transport (DCE, CORBA, ...)
• Megamodules (Wrapped or Native)

Summary
• CHAIMS requires rethinking of many common assumptions
  – gain understanding via simple examples
• Work focused on CALL statement decomposition
  – to accomplish integration of large services
  – exploit inherent asynchrony
• First versions of the architecture and language drafts are completed; basic infrastructure partially available (compiler, wrapper templates).
• More demos will come soon. Half-way through a four-year project, but funding resources were drastically reduced.
==> http://www-db.stanford.edu/CHAIMS

Long-term Objectives
1 Implementing a system for a simple and purely compositional language hiding differences of diverse protocols
2 Automatic optimized scheduling of invocations (taking advantage of inherent parallelism and estimate-capabilities of megamodules, hence splitting up of CALL-statement)
3 Decision-making support: (direct) interaction with megamodules, based on overview and incremental results (fixed flow, not yet interactive changes to megaprogram)
4 Automatic dataflow optimization (direct dataflows

Assumptions, Additional Constraints
• Heterogeneous legacy modules ==> wrapping of modules, mixing protocols on client side or in wrappers.
• Parallelism of megamodule-methods not through multithreading on client side but through splitting up CALL-statement (==> sequential program on client side); this leads to useful parallelism because we deal with coarse-grain parallelism.
• CHAIMS-compliancy for megamodules is achieved by wrapper templates, for new native megamodules as well as for legacy ones (CHAIMS-compliancy is more than just knowing CHAIMS-protocol!).
• No reliance on existence of one specific higher level protocol like CORBA, DCOM, RMI ==> implementing an independent data-encoding and marshalling with ASN.1, instead of using one of them and then having converters in the wrappers.
• Interfaces of megamodules match <==> no investigation into opaque datablobs on client side necessary.
• Thin client, client should be able to run anywhere (not quite fulfilled

Non- (not yet)-Objectives
• No commercial product.
• No specific controls over ilities (security, name-serving, etc.) that are normally present in distributed systems.
• No sophisticated front-end, no graphical programming/composition, no browser for repository, no higher-level language as input (not yet).
• Not solving all problems of megamodule composition that are mentioned in the various CHAIMS-papers (e.g. differing ontologies, non-matching interfaces of megamodules), only the ones mentioned in objectives and additional conditions.

Proposed Changes to Architecture: Other Approach to Heterogeneity
[Figure: at the client site, the client (megaprogram) speaks the CHAIMS protocol over TCP/IP sockets; it reaches CHAIMS-compliant modules and the CHAIMS I/O module directly, and native servers 1 to 3 through RMI and CORBA wrappers that use server-specific protocols (CORBA, RMI); wrappers may sit at the sites of the servers or at a different wrapper site.]

Open Source for Composition?
• Gnutella

Reasons for an Alternative Architecture
Overall:
• Simpler architecture: fewer wrappers, just one protocol on client side
Server-side:
• No direct linking with legacy code also for CORBA-wrappers, different sites for wrapper and legacy megamodule possible
• All native CHAIMS-megamodules will be built using wrapper templates ==> no reason for several protocols, they can all use TCP/IP.
• Dataflow-optimization: direct messages between megamodules/their wrappers necessary (without bridges)
Client-side:
• Thin client that could run everywhere (TCP/IP is available everywhere, but not CORBA or DCE, RMI also is easily available everywhere).
• CSRT could be implemented by interpreter instead of compiler, maybe also possible with current architecture, but more complex.
• We use just the transport facility (really true? what about native CHAIMS-types like string, integer, boolean?) of CORBA, RMI, DCE (for data we have ASN.1); this is already offered by TCP/IP ==> no unnecessary overkill
Drawback: missing one of the current funding objectives (heterogeneity on client side).