Service-Oriented Science
Scaling Science Services
Ian Foster
Argonne National Laboratory, University of Chicago, Univa Corporation
APAC Conference, September 28, 2005
iGrid Workshop, September 27, 2005

2 Two Questions
• How do we scale the number of scientists benefiting from computational techniques?
• What should be the role of infrastructure providers in enabling this scaling?

3 Computational Science
Computation joins theory & experiment as a third mode of scientific enquiry
• (+) Increasingly sophisticated computational approaches
• (-) Monolithic programs, databases
  • Inflexible & hard to evolve
  • Mismatch with reality of diverse & distributed teams, resources, & approaches
[Figure labels: Program & data; PC or supercomputer; GenBank]

4 Decompose over the Network
Service-Oriented Architecture
• Clients can then integrate dynamically
  • Select & compose services
  • Select “best of breed” providers
  • Publish result as a new service
• Need not know implementation details
• Note: complements, not replaces, HPC
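
The integration pattern on this slide (select, compose, publish) can be sketched in a few lines of Python. This is an illustration only: the `Service` type, the provider names, and the quality scores are hypothetical, and plain functions stand in for real networked services.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Service:
    """Hypothetical service record: name, provider, quality score, callable."""
    name: str
    provider: str
    quality: float
    call: Callable[[float], float]


def best_of_breed(candidates: List[Service], name: str) -> Service:
    """Select the highest-quality provider that offers the named service."""
    matches = [s for s in candidates if s.name == name]
    if not matches:
        raise LookupError(f"no provider offers {name!r}")
    return max(matches, key=lambda s: s.quality)


def compose(name: str, provider: str, *stages: Service) -> Service:
    """Chain services in order and wrap the pipeline as a new service."""
    def pipeline(x: float) -> float:
        for stage in stages:
            x = stage.call(x)
        return x

    # Assumption: a pipeline is only as good as its weakest stage.
    quality = min(s.quality for s in stages)
    return Service(name, provider, quality, pipeline)


# Stand-in catalog; clients see only the interface, not the implementation.
catalog = [
    Service("calibrate", "site-a", 0.7, lambda x: x * 2.0),
    Service("calibrate", "site-b", 0.9, lambda x: x * 2.0),
    Service("analyze", "site-c", 0.8, lambda x: x + 1.0),
]

calibrate = best_of_breed(catalog, "calibrate")
analyze = best_of_breed(catalog, "analyze")
survey = compose("survey", "my-lab", calibrate, analyze)
catalog.append(survey)  # publish the result as a new service
```

Once published, `survey` can be selected and composed by other clients exactly like the services it was built from.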

5 For Example: Virtual Observatories
[Figure labels: Surveys; Observatories; Missions; Survey and mission archives; Digital libraries; Numerical sims; Sloan vs. 2MASS; Brown dwarf candidates]

6 Having Decomposed, Integrate
• For example
  • Registries
  • Value-added services
  • Workflows
• Issues
  • Description
  • Discovery
  • Composition
  • Adaptation & evolution
  • Qualities of service: security, performance, reliability, …
[Figure labels: Users; Discovery tools; Analysis tools; Data archives]
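
A registry ties together several of the issues listed above: description, discovery, evolution, and qualities of service. The sketch below is a toy in-memory registry, not any particular registry product; the class names, the `example.org` endpoints, and the `reliability` figures are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Description:
    """Hypothetical service description held by the registry."""
    name: str
    kind: str          # e.g. "archive" or "analysis"
    endpoint: str      # placeholder address
    qos: Dict[str, float] = field(default_factory=dict)


class Registry:
    """Toy registry: publish descriptions, discover them by kind and QoS."""

    def __init__(self) -> None:
        self._entries: Dict[str, Description] = {}

    def publish(self, description: Description) -> None:
        # Re-publishing under the same name replaces the entry (evolution).
        self._entries[description.name] = description

    def discover(self, kind: str, **min_qos: float) -> List[Description]:
        """Return services of the given kind that meet every QoS floor."""
        hits = [
            d for d in self._entries.values()
            if d.kind == kind
            and all(d.qos.get(k, 0.0) >= v for k, v in min_qos.items())
        ]
        return sorted(hits, key=lambda d: d.name)


registry = Registry()
registry.publish(Description("sky-archive", "archive",
                             "https://example.org/sky", {"reliability": 0.99}))
registry.publish(Description("old-archive", "archive",
                             "https://example.org/old", {"reliability": 0.90}))
registry.publish(Description("cross-match", "analysis",
                             "https://example.org/xmatch", {"reliability": 0.95}))

reliable = registry.discover("archive", reliability=0.95)
```

Here `reliable` contains only `sky-archive`; a provider that improves its service simply publishes an updated description under the same name.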

7 Example Value-Added Service: PUMA
• PUMA Knowledge Base: information about proteins analyzed against ~2 million gene sequences
• Analysis on Grid: involves millions of BLAST, BLOCKS, and other processes
Natalia Maltsev et al.

8 SOA = Silo-Oriented Architecture?
• What about dynamic behaviors?
  • Time-varying load
  • Dynamically instantiated services
• What about operating costs?
  • Software deployment & maintenance
  • Security & other concerns
[Figure labels: Week (6, 7, 8); Services; Operating; or?]

9 We Need to Decompose in Two Dimensions
[Figure label: Horizontal]

10 We Need to Decompose in Two Dimensions
[Figure labels: Vertical; Horizontal]

11 Decomposition Enables On-Demand Provisioning
• Separate production & consumption
• Aggregate resources
• Deliver to services
• Issues
  • Discovery
  • Composition
  • Qualities of service
[Figure labels: IPC Server; IPC Dispatcher; Globus; Provision New Worker Process]
SAP GlobusWorld Demo (IPC = Internet Pricing Configurator)
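
The idea of a dispatcher that provisions worker processes as load changes can be illustrated with a toy sizing rule. The sketch below is not the logic of the SAP demo; the `Dispatcher` class, the per-worker capacity, and the pool limit are assumptions chosen only to show production (the worker pool) being sized separately from consumption (the request load).

```python
import math


class Dispatcher:
    """Toy dispatcher that sizes a worker pool to the current request load."""

    def __init__(self, per_worker: int, max_workers: int) -> None:
        self.per_worker = per_worker    # requests one worker handles at once
        self.max_workers = max_workers  # limit of the shared resource pool
        self.workers = 1                # always keep one worker running

    def rebalance(self, load: int) -> int:
        """Provision or release workers so capacity tracks the load."""
        needed = max(1, math.ceil(load / self.per_worker))
        self.workers = min(needed, self.max_workers)
        return self.workers


dispatcher = Dispatcher(per_worker=10, max_workers=8)
# Time-varying load: quiet, busy, overloaded, quiet again.
history = [dispatcher.rebalance(load) for load in (5, 35, 120, 12)]
```

With these numbers the pool grows from 1 worker to 4, is capped at 8 under overload, and shrinks back to 2 when demand falls.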

12 The Globus-Based LIGO Data Grid
LIGO Gravitational Wave Observatory
• Replicating >1 Terabyte/day to 8 sites
• >30 million replicas so far
• MTBF = 1 month
[Map labels: Birmingham; Cardiff; AEI/Golm]
www.globus.org/solutions

13 Decomposition Enables Separation of Concerns & Roles
• User to service provider: “Provide access to data D at S1, S2, S3 with performance P”
• Service provider to resource provider: “Provide storage with performance P1, network with P2, …”
• Service provider mechanisms: replica catalog, user-level multicast, …
[Figure labels: User; Service Provider; Resource Provider; D; S1; S2; S3]
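
The division of roles on this slide can be sketched as two small types: resource providers state what storage and network performance they offer, and a service provider uses a replica catalog to meet the user's performance target. Everything below is hypothetical (class names, MB/s figures, and the rule that a site delivers the minimum of its storage and network rates); it is a sketch of the idea, not of any Globus component.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ResourceOffer:
    """What a resource provider promises at one site (made-up units: MB/s)."""
    site: str
    storage_mb_s: float   # "storage with performance P1"
    network_mb_s: float   # "network with P2"

    def deliverable(self) -> float:
        # Assumption: the slower of storage and network bounds delivery.
        return min(self.storage_mb_s, self.network_mb_s)


class DataService:
    """Service provider: keeps a replica catalog on top of resource offers."""

    def __init__(self, offers: List[ResourceOffer]) -> None:
        self._offers = {o.site: o for o in offers}
        self._catalog: Dict[str, List[str]] = {}

    def replicate(self, dataset: str, sites: List[str]) -> None:
        self._catalog[dataset] = list(sites)

    def locate(self, dataset: str, required_mb_s: float) -> str:
        """Return the best replica site that meets the user's target P."""
        sites = self._catalog.get(dataset, [])
        good = [s for s in sites
                if self._offers[s].deliverable() >= required_mb_s]
        if not good:
            raise LookupError(f"no replica of {dataset!r} meets the target")
        return max(good, key=lambda s: self._offers[s].deliverable())


service = DataService([
    ResourceOffer("S1", storage_mb_s=100.0, network_mb_s=40.0),
    ResourceOffer("S2", storage_mb_s=80.0, network_mb_s=90.0),
    ResourceOffer("S3", storage_mb_s=200.0, network_mb_s=20.0),
])
service.replicate("D", ["S1", "S2", "S3"])
site = service.locate("D", required_mb_s=50.0)
```

The user asks only for data D with performance P; which site serves it, and how the sites were provisioned, stays the concern of the service and resource providers.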

14 Scaling Up
“Sometimes through heroism you can make something work. However, understanding why it worked, abstracting it, making it a primitive is the key to getting to the next order of magnitude of scale.” Robert Calderbank
We want to scale the number, robustness, & performance of services

15 Identifying Primitives: (1) Taking Services Seriously
• Model the world as a collection of services
  • Computations, computers, instruments, storage, data, communities, agreements, …
• Focus on what these things have in common
• E.g., lifecycle management
  • Negotiation, deployment/creation, modeling, monitoring, management, termination
• E.g., security
  • Authentication, authorization, audit, …
Web Services-based Grid infrastructure
I. Foster, S. Tuecke, Describing the Elephant: The Many Faces of IT as Service, ACM Queue, 2005
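
To make "focus on what these things have in common" concrete, the sketch below gives very different services the same lifecycle. The phase names are taken loosely from the slide, but the grouping of modeling, monitoring, and management into one running phase and the allowed transitions are assumptions made for illustration.

```python
from enum import Enum


class Phase(Enum):
    NEGOTIATION = 1
    DEPLOYMENT = 2
    MANAGEMENT = 3    # stands for modeling, monitoring, management while running
    TERMINATION = 4


# Allowed transitions: an illustrative assumption, not a published standard.
ALLOWED = {
    Phase.NEGOTIATION: {Phase.DEPLOYMENT, Phase.TERMINATION},
    Phase.DEPLOYMENT: {Phase.MANAGEMENT, Phase.TERMINATION},
    Phase.MANAGEMENT: {Phase.TERMINATION},
    Phase.TERMINATION: set(),
}


class ManagedService:
    """Any service (computation, storage, instrument) with a shared lifecycle."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.phase = Phase.NEGOTIATION
        self.log = [Phase.NEGOTIATION]

    def advance(self, nxt: Phase) -> None:
        if nxt not in ALLOWED[self.phase]:
            raise ValueError(f"{self.name}: {self.phase.name} -> {nxt.name} not allowed")
        self.phase = nxt
        self.log.append(nxt)


storage = ManagedService("storage")
compute = ManagedService("computation")
for svc in (storage, compute):   # the same code manages both kinds of service
    svc.advance(Phase.DEPLOYMENT)
    svc.advance(Phase.MANAGEMENT)
    svc.advance(Phase.TERMINATION)
```

Because the lifecycle is a shared primitive, the management loop does not need to know whether it is handling storage or a computation.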

16 Identifying Primitives: (2) Interface Specifications
• Applications of the framework (compute, network, storage provisioning, job reservation & submission, data management, application service QoS, …)
• WS-Agreement (agreement negotiation)
• WS Distributed Management (lifecycle, monitoring, …)
• WS-Resource Framework & WS-Notification* (resource identity, lifetime, inspection, subscription, …)
• Web services (WSDL, SOAP, WS-Security, WS-ReliableMessaging, …)
*WS-Transfer, WS-Enumeration, WS-Eventing, WS-Management define similar functions
Foster, Czajkowski, Frey, et al., From OGSI to WSRF, Proc. IEEE, 93(3), 604-612, 2005
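
The four functions named for the WS-Resource Framework & WS-Notification layer (resource identity, lifetime, inspection, subscription) can be illustrated with a small in-process object. This is a conceptual sketch only: `StatefulResource` and its methods are hypothetical names, not the WSRF or WS-Notification interfaces, and time is passed in explicitly to keep the example deterministic.

```python
from typing import Any, Callable, Dict, List


class StatefulResource:
    """Conceptual stand-in for a resource with identity, lifetime,
    inspectable properties, and change notifications."""

    def __init__(self, resource_id: str, now: float, lifetime_s: float) -> None:
        self.resource_id = resource_id            # identity
        self._expires_at = now + lifetime_s       # lifetime
        self._properties: Dict[str, Any] = {}
        self._subscribers: List[Callable[[str, Any], None]] = []

    def is_alive(self, now: float) -> bool:
        return now < self._expires_at

    def extend_lifetime(self, now: float, lifetime_s: float) -> None:
        self._expires_at = now + lifetime_s

    def get_property(self, name: str) -> Any:     # inspection
        return self._properties[name]

    def subscribe(self, callback: Callable[[str, Any], None]) -> None:
        self._subscribers.append(callback)        # subscription

    def set_property(self, name: str, value: Any) -> None:
        self._properties[name] = value
        for callback in self._subscribers:        # notify on every change
            callback(name, value)


events = []
job = StatefulResource("job-42", now=0.0, lifetime_s=60.0)
job.subscribe(lambda name, value: events.append((name, value)))
job.set_property("state", "running")
```

A client that subscribed learns about the state change without polling, and the resource disappears on its own unless its lifetime is extended.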

17 Identifying Primitives: (3) Open Source Implementation
www.globus.org
• Security: Credential Mgmt; Delegation; Community Authorization; Authentication Authorization
• Data Mgmt: Data Replication; Replica Location; Data Access & Integration; Reliable File Transfer; GridFTP
• Execution Mgmt: Grid Telecontrol Protocol; Community Scheduling Framework; Workspace Management; Grid Resource Allocation & Management
• Info Services: WebMDS; Trigger; Index
• Common Runtime: Python Runtime; C Runtime; Java Runtime
I. Foster, Globus Toolkit Version 4: Software for Service-Oriented Systems, LNCS 3779, 2-13, 2005

18 Open Science Grid
• 50 sites (15,000 CPUs) & growing
• 400 to >1000 concurrent jobs
• Many applications + CS experiments; includes long-running production operations
• Up since October 2003; few FTEs central ops
[Figure: Jobs (2004)]
www.opensciencegrid.org

19 Virtual OSG Clusters
[Figure labels: OSG cluster; Xen hypervisors; TeraGrid cluster]

20 Dynamic Service Deployment
Community A
• Community scheduling logic
• Data distribution
• Community management
• Science services
• …
…
Community Z
Requirements:
• Community control
• Persistence
• Resource guarantees
• Noninterference

21 Summary
• How do we scale the number of scientists benefiting from computational techniques?
  • Construct powerful science services
  • Simplify construction by decomposing roles: content, function, resource
• What should be the role of infrastructure providers in enabling this scaling?
  • Service providers for communities wanting to deliver content
  • Resource providers for service providers wanting to deliver services

22 Service-Oriented Science: Scaling by Separating Concerns
[Figure: layers Content, Function, Resources; columns Domain-dependent and Domain-independent; connectors "Hosted by" and "Enabled by". Item labels: Expt design; Expt output; Telepresence monitor; Simulation code; Electronic notebook; Portal server; Simulation server; Metadata catalog; Certificate authority; Data archive; Experimental apparatus; Servers, storage, networks]
I. Foster, Service-Oriented Science, Science 308, May 6, 2005

23 Acknowledgments
• NSF, DOE, NASA, IBM for financial support
• Numerous fine colleagues at Argonne, U.Chicago, USC/ISI, and elsewhere
• In particular: Carl Kesselman, Steve Tuecke, Kate Keahey, Bill Allcock, Ann Chervenak, Ewa Deelman, Jennifer Schopf, Mike Wilde

24 For More Information
• Globus Alliance: www.globus.org
• Papers: www.mcs.anl.gov/~foster
For those at APAC: Globus Toolkit Tutorial (Thursday, Friday)
For those at iGrid: Carl Kesselman’s Master Class (Thursday)