
Grid Services Overview & Introduction
Ian Foster
Argonne National Laboratory, University of Chicago, Univa Corporation
OOSTech, Baltimore, October 26, 2005

What’s This About “Grid Services”?
• I will describe Web service interfaces that implement useful behaviors
  - Primitives: resources, state, security
  - Services: program execution, data movement, data access, …
• I will also describe open source software that implements those interfaces
  - In particular, Globus Toolkit (GT4)
• This is all standard Web services!
  - “Grid is a use case for Web services, focused on resource management”

What Grid is About: Aggregation in Virtual Organizations
• Distributed resources and people
• Linked by networks, crossing admin domains
• Sharing resources, common goals
• Dynamic behaviors
• Fault tolerant
Figure: resources (R) and virtual organizations VO-A and VO-B.

Grid Technology: Take Services Seriously
• Model the world as a collection of services
  - Computations, computers, instruments, storage, data, communities, agreements, …
• Focus on what these things have in common
  - E.g., state modeling & lifecycle: negotiation, deployment/creation, modeling, monitoring, management, termination
  - E.g., security: authentication, authorization, audit, …
• Result is Grid infrastructure
  - Using Web services as a platform

“Stateless” vs. “Stateful” Services
Figure: a Client calls move (A to B) on a FileTransferService.
• Without state, how does the client:
  - Determine what happened (success/failure)?
  - Find out how many files completed?
  - Receive updates when interesting events arise?
  - Terminate a request?
• Few useful services are truly “stateless”, but WS interfaces alone do not provide built-in support for state

FileTransferService (without WSRF)
Figure: the Client calls move (A to B), which returns a transferID; the service's state is reached through the custom operations whatHappen, tellMeWhen, and cancel.
• Developer reinvents the wheel for each new service
  - Custom management and identification of state: transferID
  - Custom operations to inspect state synchronously (whatHappen) and asynchronously (tellMeWhen)
  - Custom lifetime operation (cancel)
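The hand-rolled approach on this slide can be sketched as a toy in-memory service. This is only an illustration: the class name, the method names (which mirror the slide's move, whatHappen, tellMeWhen, and cancel labels), and the status strings are assumptions, not code from any Globus release.

```python
import itertools
from typing import Callable, Dict, List

Listener = Callable[[int, str], None]


class AdHocFileTransferService:
    """Toy service that invents its own state identification and inspection."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._status: Dict[int, str] = {}
        self._listeners: Dict[int, List[Listener]] = {}

    def move(self, source: str, destination: str) -> int:
        """Start a transfer and return a service-specific transferID."""
        transfer_id = next(self._ids)
        self._status[transfer_id] = "active"
        self._listeners[transfer_id] = []
        return transfer_id

    def what_happen(self, transfer_id: int) -> str:
        """Custom synchronous inspection of state."""
        return self._status[transfer_id]

    def tell_me_when(self, transfer_id: int, listener: Listener) -> None:
        """Custom asynchronous inspection: call listener on each status change."""
        self._listeners[transfer_id].append(listener)

    def cancel(self, transfer_id: int) -> None:
        """Custom lifetime operation."""
        self._status[transfer_id] = "cancelled"
        for listener in self._listeners[transfer_id]:
            listener(transfer_id, "cancelled")


service = AdHocFileTransferService()
events = []
transfer_id = service.move("A", "B")
service.tell_me_when(transfer_id, lambda tid, status: events.append((tid, status)))
service.cancel(transfer_id)
print(service.what_happen(transfer_id), events)
```

Every operation here is specific to this one service, so a client written against it cannot inspect or terminate the state of any other service without learning a new interface.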

WSRF in a Nutshell
• Service
• State representation
  - Resource
  - Resource Property
• State identification
  - Endpoint Reference
• State Interfaces
  - GetRP, QueryRPs, GetMultipleRPs, SetRP
• Lifetime Interfaces
  - SetTerminationTime
  - ImmediateDestruction
• Notification Interfaces
  - Subscribe
  - Notify
• ServiceGroups
Figure: a Service whose Resources are identified by EPRs and carry RPs, with the operations GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, and Destroy.

FileTransferService (w/ WSRF)
Figure: the Client calls createResource (A to B), which returns an EPR; the Transfer resource and its RPs are then reached through getRP, queryRPs, and destroy.
• Developer specifies custom method to createResource and leaves the rest to WSRF standards:
  - State exposed as Resource + Resource Properties and identified by Endpoint Reference (EPR)
  - State inspected by standard interfaces (GetRP, QueryRPs)
  - Lifetime management by standard interfaces (Destroy)
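The division of labor on this slide can be sketched as follows: only the creation method is service-specific, while inspection and lifetime operations are generic and keyed by an endpoint reference. This is a toy in-memory model, not the GT4 or WSRF API; the class names, the property names (status, filesCompleted), and the example address are assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class EndpointReference:
    """Identifies one piece of state: service address plus resource key."""
    address: str
    resource_key: str


class ResourceHome:
    """Generic store offering the same operations for every service."""

    def __init__(self, address: str) -> None:
        self._address = address
        self._resources: Dict[str, Dict[str, Any]] = {}

    def create(self, **properties: Any) -> EndpointReference:
        key = uuid.uuid4().hex
        self._resources[key] = dict(properties, TerminationTime=None)
        return EndpointReference(self._address, key)

    def get_resource_property(self, epr: EndpointReference, name: str) -> Any:
        return self._resources[epr.resource_key][name]

    def set_termination_time(self, epr: EndpointReference, when: Optional[float]) -> None:
        self._resources[epr.resource_key]["TerminationTime"] = when

    def destroy(self, epr: EndpointReference) -> None:
        del self._resources[epr.resource_key]

    def exists(self, epr: EndpointReference) -> bool:
        return epr.resource_key in self._resources


class FileTransferService:
    """The only custom piece: how a transfer resource is created."""

    def __init__(self, home: ResourceHome) -> None:
        self._home = home

    def create_resource(self, source: str, destination: str) -> EndpointReference:
        return self._home.create(
            source=source, destination=destination, status="active", filesCompleted=0
        )


# Hypothetical address, used only to make the EPR look realistic.
home = ResourceHome("https://host.example.org:8443/wsrf/services/FileTransferService")
service = FileTransferService(home)
epr = service.create_resource("A", "B")
status_before = home.get_resource_property(epr, "status")
home.set_termination_time(epr, 3600.0)
home.destroy(epr)
still_exists = home.exists(epr)
print(status_before, still_exists)
```

A client that knows the generic operations can inspect and destroy resources of any service built this way, which is the reuse the slide attributes to the WSRF standards.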

Grid Infrastructure: Open Standards
Layers, from top to bottom:
• Applications of the framework (compute, network, storage provisioning, job reservation & submission, data management, application service QoS, …)
• WS-Agreement (agreement negotiation)
• WS Distributed Management (lifecycle, monitoring, …)
• WS-Resource Framework & WS-Notification* (resource identity, lifetime, inspection, subscription, …)
• Web services (WSDL, SOAP, WS-Security, WS-ReliableMessaging, …)
*WS-Transfer, WS-Enumeration, WS-Eventing, WS-Management define similar functions

Globus Toolkit: Open Source Grid Infrastructure
Globus Toolkit v4, www.globus.org
• Security: Authentication and Authorization, Community Authorization, Delegation, Credential Mgmt
• Data Mgmt: GridFTP, Reliable File Transfer, Data Access & Integration, Replica Location, Data Replication
• Execution Mgmt: Grid Resource Allocation & Management, Workspace Management, Community Scheduling Framework, Grid Telecontrol Protocol
• Info Services: Index, Trigger, WebMDS
• Common Runtime (tools for building WSRF services): Java Runtime, C Runtime, Python Runtime

GT4 WS Core in a Nutshell
Figure: a Service whose Resources are identified by EPRs and carry RPs, with the operations GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, and Destroy.
• Implementation of WSRF: Resources, EndpointReferences, ResourceProperties
• Operation Providers: pre-built implementations of WSRF operations
• Notification implementation: Topics, TopicSet, embedded Notification Consumer service
• Implementations of Resources (ReflectionResource, PersistentReflectionResource) and ResourceProperties (SimpleResourceProperty, ReflectionResourceProperty)

GT4 WS Core in a Nutshell
Figure adds: Service Container, ResourceHome.
• Service Container: hosts multiple services in one container, running as a single JVM process
• More details: based on the AXIS service container; processes SOAP messages; ResourceContext extension

GT4 WS Core in a Nutshell
Figure adds: PIP, PDP.
• Secure Communication: Transport, Message, Conversation (Transport demonstrates best performance)
• Configurable Security Policies: chained Policy Information Points (PIPs) and Policy Decision Points (PDPs)
• Example authorization PDPs: GridMap, SAML implementations, XACML policies
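A chain of PIPs and PDPs can be sketched in a few lines. This is a minimal illustration, not the GT4 authorization API. It assumes that PIPs only add attributes to the request and that a request is permitted only when every PDP in the chain permits it; the slide does not state the combining rule. The subject name, the grid-map contents, and the per-operation table are hypothetical.

```python
from typing import Callable, Dict, List, Set

Attributes = Dict[str, str]
PIP = Callable[[Attributes], None]
PDP = Callable[[Attributes], bool]

# Hypothetical policy data, for illustration only.
GRID_MAP: Dict[str, str] = {"/O=Grid/OU=Example/CN=Alice": "alice"}
ALLOWED_OPERATIONS: Dict[str, Set[str]] = {
    "alice": {"GetResourceProperty", "Subscribe"},
}


def gridmap_pip(attributes: Attributes) -> None:
    """Policy Information Point: map the certificate subject to a local user."""
    local_user = GRID_MAP.get(attributes["subject"])
    if local_user is not None:
        attributes["local_user"] = local_user


def known_user_pdp(attributes: Attributes) -> bool:
    """Policy Decision Point: permit only subjects found in the grid map."""
    return "local_user" in attributes


def per_operation_pdp(attributes: Attributes) -> bool:
    """Policy Decision Point: permit only operations allowed for this user."""
    allowed = ALLOWED_OPERATIONS.get(attributes.get("local_user", ""), set())
    return attributes["operation"] in allowed


def authorize(request: Attributes, pips: List[PIP], pdps: List[PDP]) -> bool:
    """Run all PIPs, then require every PDP in the chain to permit."""
    attributes = dict(request)
    for pip in pips:
        pip(attributes)
    return all(pdp(attributes) for pdp in pdps)


pips = [gridmap_pip]
pdps = [known_user_pdp, per_operation_pdp]
alice = "/O=Grid/OU=Example/CN=Alice"
read_ok = authorize({"subject": alice, "operation": "GetResourceProperty"}, pips, pdps)
destroy_ok = authorize({"subject": alice, "operation": "Destroy"}, pips, pdps)
stranger_ok = authorize(
    {"subject": "/O=Grid/CN=Mallory", "operation": "Subscribe"}, pips, pdps
)
print(read_ok, destroy_ok, stranger_ok)
```

Because each PDP is just a function over the collected attributes, swapping a grid-map check for a SAML callout or an XACML evaluation would change one list entry rather than the service code.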

GT4 WS Core in a Nutshell
Figure adds: WorkManager, DB Conn Pool, JNDI Directory.
• WorkManager: “thread pool”, site independent “work” manager
• Apache Database Connection Pool library (JDBC “DataSource” implementation)
• JNDI Directory: manages internal, shared objects (ResourceHomes, WorkManager, Configuration objects, …)
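The “thread pool” idea behind the WorkManager can be illustrated with Python's standard library. This is an analogy, not the GT4 WorkManager interface: services hand units of work to one shared pool instead of creating their own threads. The function name and pool size are arbitrary choices for the sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def check_transfer(transfer_id: int) -> str:
    """Stand-in for a unit of work that a service schedules."""
    return f"transfer {transfer_id} checked"


# One shared pool plays the role of the container-wide work manager.
with ThreadPoolExecutor(max_workers=4) as work_manager:
    futures = [work_manager.submit(check_transfer, i) for i in range(3)]
    results = [future.result() for future in futures]

print(results)
```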

GT4 WS Core in a Nutshell
Figure adds: Apache Tomcat around the Service Container.
• Deploy Service Container “standalone” or within Apache Tomcat

GT4 Security
• Public-key-based authentication
• Extensible authorization framework based on Web services standards
  - SAML-based authorization callout
    - As specified in GGF OGSA-Authz WG
  - Integrated policy decision engine
    - XACML policy language, per-operation policies, pluggable
• Credential management service
  - MyProxy (one-time password support)
• Community Authorization Service
• Standalone delegation service

GT4 Use of Security Standards
Figure labels: “Supported, but slow”; “Supported, but insecure”; “Fastest, so default”.

GT-XACML Integration
• eXtensible Access Control Markup Language
  - OASIS standard, open source implementations
• XACML: sophisticated policy language
• Globus Toolkit ships with XACML runtime
  - Included in every client and server built on GT
  - Turned on through configuration
• … that can be called transparently from runtime and/or explicitly from application …
• … and we use the XACML “model” for our Authz Processing Framework

I. Foster, Globus Toolkit Version 4: Software for Service-Oriented Systems, LNCS 3779, 2-13, 2005

Managing Computers & Computation
• GRAM (Grid Resource Allocation & Management) service
  - Negotiate access
  - Stage code
  - Monitor service
  - Manage service
  - Collect accounting data
• Can negotiate access to clusters, creation of virtual machines, establishment of virtual networks, …
Figure: Client and GRAM.

Dynamic Provisioning of Computational Services
Figure: Open Science Grid use over 6 months (usage in CPUs), with ATLAS DC2 and CMS DC04 labeled.

Dynamic Service Deployment
Community A … Community Z
• Community scheduling logic
• Data distribution
• Community management
• Science services
• PlanetLab nodes
• ...
Requirements:
• Community control
• Persistence
• Resource guarantees
• Noninterference

Managing Storage & Data
• Service interfaces for managing storage & data movement
  - Storage management (SRM, NeST)
  - Data movement (GridFTP, RFT)
  - Replica management (RLS, DRS)
• Service interfaces for accessing data in diverse formats
  - OGSA Data Access & Integration
  - GridFTP data access & movement

GridFTP in GT4
• 100% Globus code
  - No licensing issues
  - Stable, extensible
• IPv6 support
• XIO for different transports
• Striping for multi-Gb/sec wide area transport
  - 27 Gbit/s on 30 Gbit/s link
• Pluggable
  - Front-end: e.g., future WS control channel
  - Back-end: e.g., HPSS, cluster file systems
  - Transfer: e.g., UDP, NetBLT transport
Figure: disk-to-disk on TeraGrid.

Reliable File Transfer: Third Party Transfer
• Fire-and-forget transfer
• Web services interface
• Many files & directories
• Integrated failure recovery
• Has transferred 900K files
Figure labels: RFT Client; SOAP Messages; Notifications (Optional); RFT Service; two GridFTP Servers, each with Protocol Interpreter, Master DSI, IPC Link, Slave DSI, IPC Receiver, and Data Channel.
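The combination of “fire-and-forget” and “integrated failure recovery” can be sketched as a loop that records per-file progress and retries failures, so that a restarted request skips files that already completed. This is an illustration of the idea only, not RFT's actual algorithm or interface; the function names, the retry limit, the use of a dict as persistent state, and the gsiftp URLs are assumptions.

```python
from typing import Callable, Dict, List, Tuple

CopyFn = Callable[[str, str], None]


def run_transfers(
    pairs: List[Tuple[str, str]],
    copy: CopyFn,
    state: Dict[str, str],
    max_attempts: int = 3,
) -> Dict[str, str]:
    """Copy each (source, destination) pair, recording progress in state."""
    for source, destination in pairs:
        if state.get(source) == "done":
            continue  # finished in an earlier run; do not repeat it
        for _ in range(max_attempts):
            try:
                copy(source, destination)
            except OSError:
                state[source] = "failed"
            else:
                state[source] = "done"
                break
    return state


calls: List[str] = []


def flaky_copy(source: str, destination: str) -> None:
    """Simulated transfer that fails the first time it sees file f2."""
    calls.append(source)
    if source.endswith("f2") and calls.count(source) == 1:
        raise OSError("simulated network failure")


# Hypothetical URLs, for illustration only.
pairs = [
    ("gsiftp://siteA.example.org/data/f1", "gsiftp://siteB.example.org/data/f1"),
    ("gsiftp://siteA.example.org/data/f2", "gsiftp://siteB.example.org/data/f2"),
]
state: Dict[str, str] = {}
run_transfers(pairs, flaky_copy, state)
calls_after_first_run = len(calls)
run_transfers(pairs, flaky_copy, state)  # restart: nothing left to do
print(state, calls_after_first_run, len(calls))
```

In a real service the state would live in a database rather than a dict, so that recovery also survives a restart of the service itself.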

Replica Location Service
• Identify location of files via logical to physical name map
• Distributed indexing of names, fault tolerant update protocols
• GT4 version scalable & stable
• Managing ~40 million files across ~10 sites
Table (column labels as on the slide: Index; Local DB; Update send (secs); Bloom filter (bits)):
  10K    <1    2      1M
  1M     2     24     10M
  5M     7     175    50M
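The logical-to-physical mapping and the Bloom-filter summaries mentioned on this slide can be sketched as follows. Each site keeps a local catalog, and an index holds one compact Bloom filter per site; a lookup asks only the sites whose filter may contain the name. This is a toy model, not the RLS implementation: the class, the filter parameters, the hashing scheme, and the example names are assumptions.

```python
import hashlib
from typing import Dict, Iterator, List


class BloomFilter:
    """Compact set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self._bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self._bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


# Local catalogs: logical file name -> physical locations (hypothetical data).
local_catalogs: Dict[str, Dict[str, List[str]]] = {
    "siteA": {"lfn://example/frame-001": ["gsiftp://siteA.example.org/d/frame-001"]},
    "siteB": {"lfn://example/frame-002": ["gsiftp://siteB.example.org/d/frame-002"]},
}

# Index: one Bloom filter per site, built from that site's logical names.
index: Dict[str, BloomFilter] = {}
for site, catalog in local_catalogs.items():
    summary = BloomFilter(num_bits=1024, num_hashes=3)
    for logical_name in catalog:
        summary.add(logical_name)
    index[site] = summary


def locate(logical_name: str) -> Dict[str, List[str]]:
    """Ask only sites whose filter matches, then confirm in the local catalog."""
    candidates = [site for site, summary in index.items() if logical_name in summary]
    return {
        site: local_catalogs[site][logical_name]
        for site in candidates
        if logical_name in local_catalogs[site]
    }


found = locate("lfn://example/frame-001")
missing = locate("lfn://example/no-such-file")
print(found, missing)
```

The confirmation step in the local catalog is what makes false positives harmless: a filter can send a query to the wrong site, but it can never hide a site that really holds the file.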

Reliable Wide Area Data Replication
LIGO Gravitational Wave Observatory
• Replicating >1 Terabyte/day to 8 sites
• >30 million replicas so far
• MTBF = 1 month
Figure: map of sites, including Birmingham, Cardiff, and AEI/Golm.
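To put the daily volume in perspective, the short calculation below converts 1 terabyte per day into a sustained rate. It assumes a decimal terabyte (10^12 bytes) and perfectly even traffic over 24 hours; the slide does not say whether the figure is per site or in total, so the result is only an order-of-magnitude guide.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_TERABYTE = 10**12  # assumption: decimal terabyte


def sustained_mbit_per_s(terabytes_per_day: float) -> float:
    """Average rate in Mbit/s needed to move the given daily volume."""
    bits_per_day = terabytes_per_day * BYTES_PER_TERABYTE * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


rate = sustained_mbit_per_s(1.0)
print(f"{rate:.1f} Mbit/s")  # prints "92.6 Mbit/s"
```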

Data Replication Service: An Example of Service Composition
At requesting site, deploy:
• WSRF services
  - Data Replication Service
  - Delegation Service
  - Reliable File Transfer Service
• Pre-WSRF components
  - Replica Location Service (Local Replica Catalog and Replica Location Index)
  - GridFTP Server

Data Replication Service: WSDL (PortType)

    <?xml version="1.0" encoding="utf-8"?>
    <wsdl:definitions name="Replication" …>
      …
      <wsdl:portType name="ReplicatorPortType"
                     wsrp:ResourceProperties="ReplicatorResourceProperties">
        <wsdl:operation name="createReplicator"> …
        <wsdl:operation name="start"> …
        <wsdl:operation name="stop"> …
        <wsdl:operation name="suspend"> …
        <wsdl:operation name="resume"> …
        <wsdl:operation name="findItems"> …
        <wsdl:operation name="SetTerminationTime"> …
        <wsdl:operation name="Destroy"> …
        <wsdl:operation name="QueryResourceProperties"> …
        <wsdl:operation name="GetMultipleResourceProperties"> …
        <wsdl:operation name="GetResourceProperty"> …
        <wsdl:operation name="Subscribe"> …
        <wsdl:operation name="GetCurrentMessage"> …
      </wsdl:portType>
    </wsdl:definitions>

Data Replication Service: WSDL (Resource Properties)

    <?xml version="1.0" encoding="utf-8"?>
    <wsdl:definitions name="Replication" …>
      …
      <wsdl:portType name="ReplicatorPortType"
                     wsrp:ResourceProperties="ReplicatorResourceProperties">
        … (same operations as in the PortType listing)
      </wsdl:portType>
      <xsd:element name="ReplicatorResourceProperties">
        …
        <xsd:element name="status" …/>
        <xsd:element name="stage" …/>
        <xsd:element name="result" …/>
        <xsd:element name="errorMessage" …/>
        <xsd:element name="count" …/>
        <xsd:element name="Topic" …/>
        <xsd:element name="TopicExprDialect" …/>
        <xsd:element name="TerminationTime" …/>
        <xsd:element name="CurrentTime" …/>
        <xsd:element name="FixedTopicSet" …/>
        …
      </xsd:element>
    </wsdl:definitions>

GT4 Monitoring & Discovery
Figure labels: Clients (e.g., WebMDS); WS-ServiceGroup; Registration & WSRF/WSN access; three GT4 Containers, each with an MDSIndex; Automated registration in container; adapter; Custom protocols for non-WSRF entities; GRAM; GridFTP; RFT; User.

Summary
• Services are typically stateful, but WS standards did not support stateful entities
• WSRF provides standards for management, identification, lifetime, inspection, & manipulation of stateful entities
• GT4 WS Core provides a rich environment for developing stateful services
• GT4 provides a rich set of services based on WSRF & WS-Notification

For More Information
• Globus Alliance
  - www.globus.org
• Global Grid Forum
  - www.ggf.org
• Background information
  - www.mcs.anl.gov/~foster
• 2nd Edition: www.mkp.com/grid2