Web Services and Grid Architecture and their application

Web Services and Grid Architecture and their application to Earthquake Science. USC AIST Meeting, August 31 2004. Geoffrey Fox, Community Grids Lab, Indiana University, gcf@indiana.edu

Philosophy of Web Service Grids • Much of Distributed Computing was built by natural extensions of computing models developed for sequential machines • This leads to the distributed object (DO) model represented by Java and CORBA – RPC (Remote Procedure Call) or RMI (Remote Method Invocation) for Java • Key people think this is not a good idea as it scales badly and ties distributed entities together too tightly – Distributed Objects Replaced by Services • Note CORBA was considered too complicated in both organization and proposed infrastructure – and Java was considered as “tightly coupled to Sun” – So there were other reasons to discard • Thus replace distributed objects by services connected by “one-way” messages and not by request-response messages

Web services • Web Services build loosely-coupled, distributed applications, based on the SOA principles. • Web Services interact by exchanging messages in SOAP format • The contracts for the message exchanges that implement those interactions are described via WSDL interfaces.

What is a Grid? • You won’t find a clear description of what a Grid is and how it differs from a collection of Web Services – I see no essential reason that Grid Services have different requirements than Web Services – Geoffrey Fox, David Walker, e-Science Gap Analysis, June 30 2003. Report UKeS-2003-01, http://www.nesc.ac.uk/technical_papers/UKeS-2003-01/index.html. – Notice “service-building model” is like programming language – very personal! • Grids were once defined as “Internet Scale Distributed Computing” but this isn’t good as Grids depend as much, if not more, on data as on simulations • So Grids can be termed “Internet Scale Distributed Services” and represent a way of collecting services together to solve problems where special features and quality of service are needed.

e-Infrastructure • e-Infrastructure builds on the inevitably increasing performance of networks and computers, linking them together to support new flexible linkages between computers, data systems and people • Grids and peer-to-peer networks are the technologies that build e-Infrastructure • e-Infrastructure is called CyberInfrastructure in the USA • We imagine a sea of conventional local or global connections supported by the “ordinary Internet” • Phones, web page accesses, plane trips, hallway conversations • Conventional Internet technology manages billions of low multiplicity (one client to server) or broadcast links • On this we superimpose high value multi-way organizations (linkages) supported by Grids with optimized resources and system support • Low multiplicity fully interactive real-time sessions • Resources such as databases supporting (larger) communities

N plus N Community Resources • Grid community databases are analogous to television and the news Web, which allow individuals to communicate instantly with each other via Web pages and headline news acting as proxies • N resources deposit information and N can view it: call this N plus N

Large and Small Grids • N resources in a community (N is billions for the world and 10000 for many scientific fields) • Communities are arranged hierarchically with real work being done in “groups” of M resources – M could be 10-100 in e-Science • Metcalfe’s law: value of network grows like square of number of nodes M – we call Grids where this is true Metcalfe or M^2 Grids • Nature of interaction depends on size of M or N • N plus N Shared Information Grids for largish N • M^2 Metcalfe Grids for smaller M < N • Technology support depends on M/N – might use a relatively static DHT (Distributed Hash Table) for large N and a distributed shared memory for small M • Grids must merge with peer-to-peer networks to support both N plus N and M^2 systems

M^2 Interactions • Superimpose M-way “Grids” on the sea (heatbath) of “2 by N” or N plus N “ordinary” interactions • Grids also support many community N plus N resources • Implement Grids as a software overlay network

Information Complexity I • Consider a community of N resources with groups of size M, each group having complexity C • N/M groups • Information in systems varies from coherent (harmonious) to incoherent limits • Web and Grid data resources supply coherence as in curated astronomy, bioinformatics, geophysics databases • Can consider N plus N Grids as Coherent or Harmonious Grids • I = (NM)^0.5 (C/M) in the incoherent limit to I = N (C/M) in the coherent limit • In this language Grids do one or both of • Coherence/Harmony – common shared asynchronous resources • Interactivity – increase complexity to M^2 with real-time linkage of interacting resources

Information Complexity II • N plus N Community database has I = N (coherent) • Improving on the N^0.5 incoherent case • Nearest neighbor groups have I = (NM)^0.5 • Becoming I = N in limit M = N • M is correlation length in Complex Systems approach • M-ary interactive group (M^2 Metcalfe Grids) has C = M^2 and I = (NM^3)^0.5 (incoherent) to I = NM (coherent) • Coherent case most natural in science due to synergy between Metcalfe and Coherence Grids • “Small World (logarithmic) networks” and hierarchical group structure require more discussion
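
The two limits quoted on the Information Complexity slides can be checked numerically. The sketch below is only an illustration: the function name and the sample values of N and M are assumptions, and the incoherent limit is written as sqrt(N/M) times C, which is an algebraic rewriting of the slide's (NM)^0.5 (C/M).

```python
import math


def information_complexity(n, m, c, coherent):
    """Information I for N resources split into N/M groups, each of complexity C.

    Coherent limit:   I = (N/M) * C        = N * (C/M)
    Incoherent limit: I = sqrt(N/M) * C    = (N*M)**0.5 * (C/M)
    """
    if not 0 < m <= n:
        raise ValueError("need 0 < M <= N")
    groups = n / m
    return groups * c if coherent else math.sqrt(groups) * c


# Hypothetical sizes: a field of 10000 resources working in groups of 100.
N, M = 10_000, 100

# N plus N community database: group complexity C = M.
shared_db = information_complexity(N, M, c=M, coherent=True)
nearest_neighbor = information_complexity(N, M, c=M, coherent=False)

# M^2 Metcalfe Grid: group complexity C = M^2.
metcalfe_incoherent = information_complexity(N, M, c=M**2, coherent=False)
metcalfe_coherent = information_complexity(N, M, c=M**2, coherent=True)

print(shared_db, nearest_neighbor, metcalfe_incoherent, metcalfe_coherent)
```

With C = M the coherent value reduces to N and the incoherent value to (NM)^0.5; with C = M^2 the two limits become NM and (NM^3)^0.5, matching the expressions on the slide.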

Architecture of (Web Service) Grids • Grids are built from Web Services communicating through an overlay network built in SOFTWARE on the “ordinary internet” at the application level • Grids provide the special quality of service (security, performance, fault-tolerance) and customized services needed for “distributed complex enterprises” • We need to work with the Web Service community as they debate the 60 or so proposed Web Service specifications • Use Web Service Interoperability WS-I as “best practice” • Must add further specifications to support high performance • Database “Grid Services” for N plus N case • Streaming support for M^2 case

Importance of SOAP • SOAP defines a very obvious message structure with a header and a body • The header contains information used by the “Internet operating system” – Destination, Source, Routing, Context, Sequence Number … • The message body is only used by the application and will never be looked at by the “operating system” except to encrypt or compress it, etc. • Much discussion in the field revolves around what is in the header! – e.g. WSRF adds a lot to the header
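
The header/body split can be illustrated with Python's standard XML library. This is a minimal sketch, not a real WS-* message: the SOAP 1.2 envelope namespace is the standard one, but the header vocabulary (`urn:example:service-internet`), the application vocabulary and the service addresses are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
HDR_NS = "urn:example:service-internet"   # hypothetical header vocabulary
APP_NS = "urn:example:earthquake-app"     # hypothetical application vocabulary


def build_message(destination, source, sequence_number, payload_text):
    """Build an envelope whose header carries routing data and whose body carries the payload."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{HDR_NS}}}Destination").text = destination
    ET.SubElement(header, f"{{{HDR_NS}}}Source").text = source
    ET.SubElement(header, f"{{{HDR_NS}}}SequenceNumber").text = str(sequence_number)
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, f"{{{APP_NS}}}Payload").text = payload_text
    return envelope


def route(envelope):
    """Infrastructure-side routing: reads only the header, never the body."""
    header = envelope.find(f"{{{SOAP_NS}}}Header")
    return header.find(f"{{{HDR_NS}}}Destination").text


msg = build_message("urn:example:service-b", "urn:example:service-a", 1, "fault data")
print(route(msg))
print(ET.tostring(msg, encoding="unicode"))
```

The `route` function stands in for the “Internet operating system”: it can forward the message without parsing anything the application put in the body.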

Web Services • Java is very powerful partly due to its many “frameworks” that generalize libraries, e.g. – Java Media Framework – Java Database Connectivity JDBC • Web Services have a corresponding collection of specifications that represent critical features of the distributed operating systems for “Grids of Simple Services” – Some 60 active WS-* specifications for areas such as – a. Core Infrastructure Specifications – b. Service Discovery – c. Security – d. Messaging – e. Notification – f. Workflow and Coordination – g. Characteristics – h. Metadata and State – i. User Interfaces

A List of Web Services I • a) Core Service Architecture • XSD XML Schema (W3C Recommendation) V1.0 February 1998, V1.1 February 2004 • WSDL 1.1 Web Services Description Language Version 1.1 (W3C Note) March 2001 • WSDL 2.0 Web Services Description Language Version 2.0 (W3C under development) March 2004 • SOAP 1.1 (W3C Note) V1.1 Note May 2000 • SOAP 1.2 (W3C Recommendation) June 24 2003 • b) Service Discovery • UDDI (Broadly Supported OASIS Standard) V3 August 2003 • WS-Discovery Web Services Dynamic Discovery (Microsoft, BEA, Intel …) February 2004 • WS-IL Web Services Inspection Language (IBM, Microsoft) November 2001

A List of Web Services II • c) Security • SAML Security Assertion Markup Language (OASIS) V1.1 May 2004 • XACML eXtensible Access Control Markup Language (OASIS) V1.0 February 2003 • WS-Security 2004 Web Services Security: SOAP Message Security (OASIS) Standard March 2004 • WS-SecurityPolicy Web Services Security Policy (IBM, Microsoft, RSA, Verisign) Draft December 2002 • WS-Trust Web Services Trust Language (BEA, IBM, Microsoft, RSA, Verisign …) May 2004 • WS-SecureConversation Web Services Secure Conversation Language (BEA, IBM, Microsoft, RSA, Verisign …) May 2004 • WS-Federation Web Services Federation Language (BEA, IBM, Microsoft, RSA, Verisign) July 2003

A List of Web Services III • d) Messaging • WS-Addressing Web Services Addressing (BEA, IBM, Microsoft) March 2004 • WS-MessageDelivery Web Services Message Delivery (W3C Submission by Oracle, Sun …) April 2004 • WS-Routing Web Services Routing Protocol (Microsoft) October 2001 • WS-RM Web Services Reliable Messaging (BEA, IBM, Microsoft, Tibco) v0.992 March 2004 • WS-Reliability Web Services Reliable Messaging (OASIS Web Services Reliable Messaging TC) March 2004 • SOAP MTOM SOAP Message Transmission Optimization Mechanism (W3C) June 2004 • e) Notification • WS-Eventing Web Services Eventing (BEA, Microsoft, TIBCO) January 2004 • WS-Notification Framework for Web Services Notification with WS-Topics, WS-BaseNotification, and WS-BrokeredNotification (OASIS) OASIS Web Services Notification TC set up March 2004 • JMS Java Message Service V1.1 March 2002

A List of Web Services IV • f) Coordination and Workflow, Transactions and Contextualization • WS-CAF Web Services Composite Application Framework including WS-CTX, WS-CF and WS-TXM below (OASIS Web Services Composite Application Framework TC) July 2003 • WS-CTX Web Services Context (OASIS Web Services Composite Application Framework TC) V1.0 July 2003 • WS-CF Web Services Coordination Framework (OASIS Web Services Composite Application Framework TC) V1.0 July 2003 • WS-TXM Web Services Transaction Management (OASIS Web Services Composite Application Framework TC) V1.0 July 2003 • WS-Coordination Web Services Coordination (BEA, IBM, Microsoft) September 2003 • WS-AtomicTransaction Web Services Atomic Transaction (BEA, IBM, Microsoft) September 2003 • WS-BusinessActivity Web Services Business Activity Framework (BEA, IBM, Microsoft) January 2004 • BTP Business Transaction Protocol (OASIS) May 2002 with V1.0.9.1 May 2004 • BPEL Business Process Execution Language for Web Services (OASIS) V1.1 May 2003 • WS-Choreography (W3C) V1.0 Working Draft April 2004 • WSCI (W3C) Web Service Choreography Interface V1.0 (W3C Note from BEA, Intalio, SAP, Sun, Yahoo) • WSCL Web Services Conversation Language (W3C Note) HP March 2002

A List of Web Services V • h) Metadata and State • RDF Resource Description Framework (W3C) Set of recommendations expanded from original February 1999 standard • DAML+OIL combining DAML (DARPA Agent Markup Language) and OIL (Ontology Inference Layer) (W3C) Note December 2001 • OWL Web Ontology Language (W3C) Recommendation February 2004 • WS-DistributedManagement Web Services Distributed Management Framework with MUWS and MOWS below (OASIS) • WSDM-MUWS Web Services Distributed Management: Management Using Web Services (OASIS) V0.5 Committee Draft April 2004 • WSDM-MOWS Web Services Distributed Management: Management of Web Services (OASIS) V0.5 Committee Draft April 2004 • WS-MetadataExchange Web Services Metadata Exchange (BEA, IBM, Microsoft, SAP) March 2004 • WS-RF Web Services Resource Framework including WS-ResourceProperties, WS-ResourceLifetime, WS-RenewableReferences, WS-ServiceGroup, and WS-BaseFaults (OASIS) OASIS TC set up April 2004 and V1.1 Framework March 2004 • ASAP Asynchronous Service Access Protocol (OASIS) with V1.0 working draft G June 2004 • WS-GAF Web Service Grid Application Framework (Arjuna, Newcastle University) August 2003

A List of Web Services VI • g) General Service Characteristics • WS-Policy Web Services Policy Framework (BEA, IBM, Microsoft, SAP) May 2003 • WS-PolicyAssertions Web Services Policy Assertions Language (BEA, IBM, Microsoft, SAP) May 2003 • WS-Agreement Web Services Agreement Specification (GGF under development) May 2004 • i) User Interfaces • WSRP Web Services for Remote Portlets (OASIS) OASIS Standard August 2003 • JSR 168: JSR-000168 Portlet Specification for Java binding (Java Community Process) October 2003

WS-I Interoperability • Critical underpinning of Grids and Web Services is the gradually growing set of specifications in the Web Service Interoperability Profiles • Web Services Interoperability (WS-I) Interoperability Profile 1.0a (http://www.ws-i.org) gives us XSD, WSDL 1.1, SOAP 1.1, UDDI in the basic profile and parts of WS-Security in the first security profile • We imagine the “60 Specifications” being checked out and evolved in the cauldron of the real world; occasionally best practice identifies a new specification to be added to WS-I, which gradually increases in scope – Note only 4.5 out of 60 specifications have “made it” in this definition

Web Services Grids and WS-I+ • WS-I Interoperability doesn’t cover all the capabilities needed to support Grids • WS-I+ is designed as a minimal extension of WS-I to support “most current” Grids: it adds support for – Enhanced SOAP Addressing (WS-Addressing) – Fault tolerant (reliable) messaging – Workflow as in IBM-Microsoft standard BPEL • Security and Notification best practice and support will probably get added soon – There are Web Service frameworks here but various IBM v Microsoft v Globus differences to be resolved • Portlet-based User Interfaces could be added • UK OMII Open Middleware Infrastructure Institute is adopting this approach to support the UK e-Science program – Currently UK e-Science largely either uses GT2 (as in EDG) or Simple Web Services for “database Grids” – http://www.omii.ac.uk/

Layered Architecture for Web Services and Grids [Figure: layer stack, top to bottom] • Application Specific Grids • Generally Useful Services and Grids • Workflow WSFL/BPEL • Service Management (“Context etc.”) • Service Discovery (UDDI) / Information • Service Internet Transport Protocol • Service Interfaces WSDL • Base Hosting Environment • Protocol HTTP FTP DNS … • Presentation XDR … • Session SSH … • Transport TCP UDP … • Network IP … • Data Link / Physical [Side labels grouping the layers: Higher Level Services; Service Context; Service Internet; Bit level Internet (OSI Stack)]

How SERVOGrid Fits In • There are core Web Services – the “operating system of the world” – controlled by WS-* • There is workflow – programming the Grid • There are very general Web Services and Grids such as – Database – Collaboration – Job Submittal • There are some relatively general services and Grids such as – Visualization – GIS • There are application specific services such as – Virtual California

Layers of Onion [Figure: Web Service 1, WS 2, WS 3, WS 4, each with layers Application (level 1 Programming); Application Semantics (Metadata, Ontology), level 2 “Programming”; Systems Metadata (Context, State); Basic WS-* Infrastructure; linked by Workflow (level 3) Programming] • All SERVOGrid capabilities are built as Web Services with this structure • 3 level programming model

Working up from the Bottom • We have the classic (CISCO, Juniper …) Internet routing the flood of ordinary packets in OSI stack architecture • Web Services build the “Service Internet” or IOI (Internet on Internet) with • Routing via WS-Addressing not IP header • Fault Tolerance (WS-RM not TCP) • Security (WS-Security/SecureConversation not IPSec/SSL) • Information Services (UDDI/WS-Context not DNS/Configuration files) • At message/web service level and not packet/IP address level • Software-based Service Internet is possible as computers are “fast” • Familiar from peer-to-peer networks and built as a software overlay network defining the Grid (analogy is VPN) • SOAP Header contains all information needed for the “Service Internet” (Grid Operating System) with SOAP Body containing information for the Grid application service

Consequences of Rule of the Millisecond • Useful to remember critical time scales – 1) 0.000001 ms – CPU does a calculation – 2) 0.001 to 0.01 ms – MPI latency – 3) 1 to 10 ms – wake up a thread or process – 4) 10 to 1000 ms – Internet delay • 4) implies geographically distributed metacomputing can’t in general compete with parallel systems (OK for some cases) • 3) << 4) implies RPC is not a critical programming abstraction as it ties distributed entities together and gains a time that is typically only 1% of the inevitable network delay – However many service interactions are at their heart RPC but implemented differently at times, e.g. asynchronously • 2) says MPI is not relevant for a distributed environment as low latency cannot be exploited • Even more serious than using RMI/RPC, current Object paradigms also lead to mixed up services with unclear boundaries and autonomy • Web Services are the only interesting model for services today
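
The arithmetic behind the “1% of the network delay” remark can be made concrete. In this sketch the time scales are the ones listed on the slide, while the representative points chosen inside each range (a 1 ms wake-up against a 100 ms wide-area delay) and all names are assumptions made for illustration.

```python
# Critical time scales from the slide, in milliseconds (low, high).
TIME_SCALES_MS = {
    "cpu_calculation": (1e-6, 1e-6),
    "mpi_latency": (1e-3, 1e-2),
    "thread_wakeup": (1.0, 10.0),
    "internet_delay": (10.0, 1000.0),
}


def relative_gain(local_cost_ms, network_delay_ms):
    """Fraction of the network delay that removing a local cost could save."""
    if network_delay_ms <= 0:
        raise ValueError("network delay must be positive")
    return local_cost_ms / network_delay_ms


# Assumed representative point: 1 ms thread wake-up, 100 ms Internet delay.
wakeup_vs_network = relative_gain(1.0, 100.0)

# Worst MPI latency against the best-case Internet delay.
mpi_vs_network = relative_gain(
    TIME_SCALES_MS["mpi_latency"][1], TIME_SCALES_MS["internet_delay"][0]
)

print(f"thread wake-up / network delay: {wakeup_vs_network:.1%}")
print(f"MPI latency / network delay:    {mpi_vs_network:.2%}")
```

Under these assumed values the tightly coupled call saves about one percent of the round trip, and MPI-class latency is a thousand times smaller than even the shortest Internet delay, which is the sense in which low latency cannot be exploited across a wide-area link.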

Linking Modules [Figure] • Closely coupled Java/Python …: Module A and Module B linked by method calls, 0.001 to 1 millisecond • Coarse grain service model: Service A and Service B linked by messages, 0.1 to 1000 millisecond latency • From method based to RPC to message based to event-based • Event-based: Service A (publisher) posts events to a “Message Queue in the Sky”; Service B (“listener”) subscribes to events

What is a Simple Service? • Take any system – it has multiple functionalities – We can implement each functionality as an independent distributed service – Or we can bundle multiple functionalities in a single service • Whether functionality is an independent service or one of many method calls into a “glob of software”, we can always make it a Web service by converting the interface to WSDL • Simple services are obtained by taking functionalities and making them as small as possible subject to the “rule of the millisecond” – Distributed services incur messaging overhead of one (local) to 100’s (far apart) of milliseconds to use a message rather than a method call – Use scripting or compiled integration of functionalities ONLY when you require <1 millisecond interaction latency • Apache web site has many projects that are multiple functionalities presented as (Java) globs and NOT (Java) Simple Services – Makes it hard to integrate sharing of common security, user profile, file access … services
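
One way to read “as small as possible subject to the rule of the millisecond” is as a partitioning problem: functionalities that must interact in under 1 ms are bundled into one service, and everything else is split apart. The sketch below uses a union-find merge to do this; the functionality names and latency requirements are hypothetical, and this is an illustration of the rule rather than a procedure given in the slides.

```python
def partition_into_services(functionalities, required_latency_ms, threshold_ms=1.0):
    """Group functionalities into the smallest services allowed by the latency rule.

    required_latency_ms maps a pair (a, b) to the interaction latency that pair needs.
    Pairs needing less than threshold_ms are merged into the same service.
    """
    parent = {f: f for f in functionalities}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), latency in required_latency_ms.items():
        if latency < threshold_ms:
            parent[find(a)] = find(b)  # must be linked by method calls

    services = {}
    for f in functionalities:
        services.setdefault(find(f), []).append(f)
    return sorted(sorted(group) for group in services.values())


# Hypothetical earthquake-simulation functionalities and their latency needs (ms).
functionalities = ["mesh", "solver", "visualize", "archive"]
needs = {
    ("mesh", "solver"): 0.05,        # tight inner loop: stays a method call
    ("solver", "visualize"): 50.0,   # tolerates a message
    ("visualize", "archive"): 500.0, # tolerates a wide-area message
}

services = partition_into_services(functionalities, needs)
print(services)
```

Only the mesh/solver pair is bundled here; the other functionalities become independent simple services linked by messages.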

Grids of Simple Services • Link via methods, messages, streams • Services and Grids are linked by messages • Internally to a service, functionalities are linked by methods • A simple service is the smallest Grid • We are familiar with the method-linked hierarchy: Lines of Code → Methods → Objects → Programs → Packages [Figure labels: Methods; Services; Component Grids; Overlay and Compose Grids of Grids; CPUs, Clusters, MPPs (Compute Resource Grids); Databases, Federated Databases, Sensor, Sensor Nets (Data Resource Grids)]

Component Grids? • So we build collections of Web Services which we package as component Grids – Visualization Grid – Sensor Grid – Utility Computing Grid – Person (Community) Grid – Earthquake Simulation Grid – Control Room Grid – Crisis Management Grid • We build bigger Grids by composing component Grids using the Service Internet

Critical Infrastructure (CI) Grids built as Grids of Grids [Figure labels: Gas CIGrid, Flood CIGrid, Electricity CIGrid …; Gas Services and Filters, Flood Services and Filters; Collaboration Grid, Sensor Grid, GIS Grid, Visualization Grid, Compute Grid; Core Grid Services: Registry, Security, Portals, Data Access/Storage, Notification, Workflow, Metadata, Messaging; Physical Network]

Geoscience Research and Education Grids: From Research to Education [Figure labels: Repositories, Federated Databases; Sensors, Streaming Data, Field Trip Data; Database Grid; Sensor Grid; Compute Grid; Data Filter Services; Research Simulations; SERVOGrid; GIS Grid; Discovery Services; Analysis and Visualization Portal; Research; Education; Customization Services; Education Grid Computer Farm]

IOI and CIE • Let us study the two layers IOI (Service Internet On the Bit Internet) and CIE (Context and Information Environment) • IOI is most “straightforward” as it is providing reasonably well understood capabilities at a new “level” • CIE is roughly the inter-service “shared memory” used to manage and control them at the “distributed operating system” level – Critical is “shared” (a database service) versus message-based CIE [Figure: layer stack of Application Specific Grids; Generally Useful Services and Grids; Workflow WSFL/BPEL; Service Management (“Context etc.”); Service Discovery (UDDI) / Information; Service Internet Transport Protocol; Service Interfaces WSDL; side labels: Higher Level Services, CIE, IOI]

NaradaBrokering [Figure: NaradaBrokering broker network (queues, stream, server-enhanced messaging) linking peers across a firewall; labels: Computer, Minicomputer, Server, Modem, Workstation, Laptop computer, PDA, Audio/Video Conferencing Clients, Web Service B] • NB supports messages and streams

NaradaBrokering and IOI • “Software Overlay Network” features • Support for multiple transport protocols • Support for multiple delivery mechanisms – Reliable Delivery – Exactly-once Delivery – Ordered Delivery – Optional delivery optimization modules for different modes • Compression/Decompression of payloads with optional module • Coalescing/Fragmentation of payloads with optional module • NTP Time Service • Security Service • Performance Monitoring • Performance optimized routing with optional module • Support for WS-Reliability, WS-ReliableMessaging and their Federation
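
The ordered and exactly-once delivery modes listed above can be illustrated at the receiving end with sequence numbers. The class below is a generic sketch of that idea, not NaradaBrokering's implementation or API; the class name and its interface are hypothetical.

```python
class OrderedExactlyOnceReceiver:
    """Release each message once, in sequence order, even if the transport
    duplicates or reorders it."""

    def __init__(self, first_sequence=0):
        self._next = first_sequence  # next sequence number to release
        self._pending = {}           # out-of-order messages held back

    def receive(self, sequence, payload):
        """Accept one message; return the payloads that can now be released."""
        if sequence < self._next or sequence in self._pending:
            return []  # duplicate: already released or already buffered
        self._pending[sequence] = payload
        released = []
        while self._next in self._pending:
            released.append(self._pending.pop(self._next))
            self._next += 1
        return released


receiver = OrderedExactlyOnceReceiver()
out_of_order = receiver.receive(1, "b")   # held until message 0 arrives
catch_up = receiver.receive(0, "a")       # releases 0 and the buffered 1
duplicate = receiver.receive(1, "b")      # retransmission is dropped
in_order = receiver.receive(2, "c")
print(out_of_order, catch_up, duplicate, in_order)
```

Reliable delivery would additionally need acknowledgements and retransmission on the sending side, which this sketch leaves out.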

Virtualizing Communication • Communication is specified in terms of user goal and Quality of Service – not in choice of port number and protocol • Bit Internet protocols have become overloaded, e.g. MUST use UDP for A/V latency requirements but CAN’T use UDP as the firewall will not support it • A given “Service Internet” communication can involve multiple transport protocols and multiple destinations – the latter possibly determined dynamically [Figure labels: Client A; NB Brokers; Satellite UDP; Firewall HTTP; Software Multicast; Filtering; Fast Link; Hand-Held Protocol; Dial-up Filter; destinations B1, B2, B3]

Performance Monitoring • Every broker incorporates a Monitoring Service that monitors links originating from the node • Every link measures and exposes a set of metrics • Average delays, jitters, loss rates, throughput • Individual links can disable measurements for individual metrics or the entire set of metrics • Measurement intervals can also be varied • The Monitoring Service returns measured metrics to the Performance Aggregator
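
A per-link monitor with selectable metrics might look like the sketch below. It is an illustration only: the class, its method names and the definition of jitter (mean absolute difference between consecutive delays) are assumptions, not NaradaBrokering's actual monitoring interface.

```python
from statistics import mean


class LinkMonitor:
    """Collects per-link measurements and reports only the enabled metrics."""

    ALL_METRICS = frozenset({"delay", "jitter", "loss_rate", "throughput"})

    def __init__(self, enabled=None):
        # A link may disable individual metrics or the entire set.
        self.enabled = set(self.ALL_METRICS if enabled is None else enabled)
        self.delays_ms = []
        self.lost = 0
        self.bytes_delivered = 0

    def record(self, size_bytes, delay_ms=None):
        """Record one message; delay_ms=None marks it as lost."""
        if delay_ms is None:
            self.lost += 1
        else:
            self.delays_ms.append(delay_ms)
            self.bytes_delivered += size_bytes

    def report(self, interval_s):
        """Metrics for one measurement interval of interval_s seconds."""
        d = self.delays_ms
        total = len(d) + self.lost
        diffs = [abs(b - a) for a, b in zip(d, d[1:])]
        values = {
            "delay": mean(d) if d else None,
            "jitter": mean(diffs) if diffs else None,  # assumed definition
            "loss_rate": self.lost / total if total else None,
            "throughput": self.bytes_delivered / interval_s,  # bytes per second
        }
        return {name: v for name, v in values.items() if name in self.enabled}


monitor = LinkMonitor()
monitor.record(1000, 10.0)
monitor.record(1000, 14.0)
monitor.record(1000)        # lost message
monitor.record(1000, 12.0)
metrics = monitor.report(interval_s=2.0)
print(metrics)
```

An aggregator would collect one such report per link and per interval; varying `interval_s` corresponds to the adjustable measurement interval mentioned on the slide.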

NaradaBrokering Service Integration [Figure: services (S1, S2) and proxies (P2) connected through messaging and notification handlers; transports labelled Standard SOAP Transport, NB Transport, Any Transport] • Internal to service: SOAP handlers/extensions/plug-ins for Java (JAX-RPC), .NET, Indigo and special cases: PDAs, gSOAP, Axis C++

Fast Web Service Communication I • The IOI application-level Internet allows one to optimize message streams at the cost of “startup time” • Web Services can deliver the fastest possible interconnections with or without reliable messaging • Typical results from Grossman (UIC) comparing slow SOAP over TCP with binary and UDP transport (latter gains a factor of 1000) • Pure SOAP, SOAP over UDP, Binary over UDP: 7020, 5.60

Fast Web Service Communication II • Mechanism only works for streams – sets of related messages • SOAP header in streams is constant except for sequence number (Message ID), time-stamp … • One needs two types of new Web Service Specification • “WS-StreamNegotiation” to define how one can use WS-Policy to send messages at start of a stream to define the methodology for treating remaining messages in stream • “WS-FlexibleRepresentation” to define new encodings of messages

Fast Web Service Communication III • Then use “WS-StreamNegotiation” to negotiate stream in Tortoise SOAP – ASCII XML over HTTP and TCP – Deposit basic SOAP header through connection – it is part of context for stream (linking of 2 services) – Agree on firewall penetration, reliability mechanism, binary representation and fast transport protocol – Naturally transport UDP plus WS-RM • Use “WS-FlexibleRepresentation” to define encoding of a Fast transport (on a different port) with messages just having “FlexibleRepresentationContextToken”, Sequence Number, Time stamp if needed – RTP packets have essentially this structure – Could add stream termination status • Can monitor and control with original negotiation stream • Can generate different streams optimized for different end-points
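
The idea of depositing the constant header once and then sending only a token, sequence number and time stamp can be sketched as follows. “WS-StreamNegotiation” and “WS-FlexibleRepresentation” are proposals in these slides, so nothing here implements them: the class, the 16-byte binary layout and the header field names are all assumptions chosen for illustration.

```python
import struct

# Assumed fast-path layout: context token (uint32), sequence number (uint32),
# time stamp (float64), all in network byte order, followed by the raw payload.
FAST_HEADER = struct.Struct("!IId")


class StreamContext:
    """Holds the constant SOAP header deposited during the slow negotiation."""

    def __init__(self):
        self._headers = {}
        self._next_token = 1

    def negotiate(self, soap_header):
        """Slow path: store the full header once and hand back a context token."""
        token = self._next_token
        self._next_token += 1
        self._headers[token] = dict(soap_header)
        return token

    def encode(self, token, sequence, timestamp, payload):
        """Fast path: only token, sequence number and time stamp precede the payload."""
        return FAST_HEADER.pack(token, sequence, timestamp) + payload

    def decode(self, packet):
        """Rebuild the full logical header from the stored context."""
        token, sequence, timestamp = FAST_HEADER.unpack_from(packet)
        header = dict(self._headers[token], sequence=sequence, timestamp=timestamp)
        return header, packet[FAST_HEADER.size:]


ctx = StreamContext()
token = ctx.negotiate({"destination": "urn:example:service-b",
                       "source": "urn:example:service-a"})
packet = ctx.encode(token, sequence=7, timestamp=1.5, payload=b"frame")
header, payload = ctx.decode(packet)
print(len(packet), header, payload)
```

In a real deployment the negotiation would run over ordinary SOAP on HTTP/TCP and the fast packets over a separately agreed transport; here both ends share one in-memory context to keep the sketch self-contained.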

CIE: Common Service Information and Metadata • Consider a collection of services working together – Workflow tells you how to specify service interaction but more basically there is shared information or context specifying/controlling the collection • WS-RF and WS-GAF have different approaches to contextualization – supplying a common “context” which at its simplest is a token to represent state • More generally core shared information includes dynamic service metadata and the equivalent of configuration information • One can support such a common context either as a pool of messages or as message-based access to a “database” (Context Service) • Two services linked by a stream are perhaps the simplest example of a collection of services needing context • Note that there is a tension between storing metadata in messages and in services – This is the shared versus distributed memory debate in parallel computing

Four Metadata Architectures [Figure labels: System or Federated Registry or Metadata Catalog; Database Grid or Domain Specific Metadata Catalogs (Database 1, Database 2, Database 3); Individual Services with Web Service Ports (SDE 1, SDE 2); Messages (M)]

Notification Architecture • Point-to-Point: publish/subscribe directly between Service A and Service B • Or Brokered: Service A publishes to a Broker and Service B subscribes to it; the Broker queues messages and supports creation and subscription of topics • NaradaBrokering will support both WS-Eventing and WS-Notification as well as Java Message Service (JMS), which is the Java notification standard
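
The brokered variant can be sketched as a small in-memory topic broker that queues messages until a subscriber appears. This illustrates the pattern only; it does not follow the WS-Eventing, WS-Notification or JMS interfaces, and the class and topic names are hypothetical.

```python
from collections import defaultdict, deque


class Broker:
    """Topic-based broker: publishers and subscribers never address each other."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> callbacks
        self._queues = defaultdict(deque)      # topic -> undelivered messages

    def subscribe(self, topic, callback):
        """Register a listener and hand it anything queued for the topic."""
        self._subscribers[topic].append(callback)
        queue = self._queues[topic]
        while queue:
            callback(queue.popleft())

    def publish(self, topic, message):
        """Deliver to current subscribers, or queue if nobody is listening yet."""
        subscribers = self._subscribers[topic]
        if not subscribers:
            self._queues[topic].append(message)
            return
        for callback in subscribers:
            callback(message)


broker = Broker()
received_by_b = []

broker.publish("earthquake/alerts", "event 1")               # Service A posts early
broker.subscribe("earthquake/alerts", received_by_b.append)  # Service B joins later
broker.publish("earthquake/alerts", "event 2")
print(received_by_b)
```

In the point-to-point case Service A would hold Service B's callback directly; the broker removes that coupling and lets the publisher run before any subscriber exists.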

Architecture Characteristics • Build as Component Web Services Grids – Simulation – Visualization – GIS – Database • NaradaBrokering provides – Fault Tolerance – Support for High Performance Streams – Basic Dynamic Information Environment – Notification • HPSearch provides – More flexible information environment with scripting – Can prototype simple workflow before implementing in BPEL • Scripted Information Environment plus Workflow supports Complexity (multi-scale iterations)