POSIX-like OGSA/SOAP Services
Arun Jagatheesan
Architect & Team Lead, SDSC Matrix
San Diego Supercomputer Center
GFS, Global Grid Forum-9
October 7, 2003, Chicago
National Partnership for Advanced Computational Infrastructure
University of Florida
San Diego Supercomputer Center

Talk Outline
• Grid File System
• The small big picture
• Need for Schema
• Need for Operation definitions
• Data Transport

Grid File System
Layered diagram; labels in order of appearance:
• Applications (Astronomy, Physics, Life Science, business apps, ...)
• Hierarchical Logical Name space, ACL, metadata
• Coordinated with other groups
• Grid File System Service (POSIX-like Interface)
• NFS/CIFS …
• Virtual Directory Service (Management of virtualization)
• Data Services
• Data Sources

The small big picture
Layered diagram; labels in order of appearance:
• OGSA/SOAP based interfaces for file operations
• Grid File System Service (POSIX-like Interface)
• XML Schema for Collections, Data Sets
• NFS or other standard interface over the virtualized schema
• NFS/CIFS …
• Virtual Directory Service (Management of virtualization)
• Data Services
• Data Sources

Grid Collection Schema
• XML Schema based description for
  • Collections or Virtual Directories
  • Data Sets
  • File System Meta-data (file size, date created, …)
  • Application Specific Meta-data
  • Access Permissions
  • …
• Logical Name space
  • Extensible
  • Scalable (more federations)
  • Dynamic Composition of the name space
  • Import and Export
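As a rough illustration of what such a description might carry, the sketch below models a collection with data sets, file-system metadata, application metadata and access permissions, and round-trips it through XML to mimic import and export. The element and attribute names (`collection`, `dataSet`, `meta`, `permission`) and the class names are assumptions made for this example; the slides do not define a schema.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class DataSet:
    name: str
    size: int            # file-system metadata: size in bytes
    created: str         # file-system metadata: creation date
    app_metadata: dict = field(default_factory=dict)  # application-specific
    acl: dict = field(default_factory=dict)           # user -> permission


@dataclass
class Collection:
    name: str
    data_sets: list = field(default_factory=list)
    children: list = field(default_factory=list)      # nested collections

    def to_xml(self) -> ET.Element:
        """Export this collection (and its children) as an XML element."""
        node = ET.Element("collection", name=self.name)
        for ds in self.data_sets:
            ds_node = ET.SubElement(node, "dataSet", name=ds.name,
                                    size=str(ds.size), created=ds.created)
            for key, value in ds.app_metadata.items():
                ET.SubElement(ds_node, "meta", key=key).text = str(value)
            for user, perm in ds.acl.items():
                ET.SubElement(ds_node, "permission", user=user).text = perm
        for child in self.children:
            node.append(child.to_xml())
        return node


def collection_from_xml(node: ET.Element) -> Collection:
    """Import a collection from the hypothetical XML layout above."""
    coll = Collection(name=node.get("name"))
    for ds_node in node.findall("dataSet"):
        coll.data_sets.append(DataSet(
            name=ds_node.get("name"),
            size=int(ds_node.get("size")),
            created=ds_node.get("created"),
            app_metadata={m.get("key"): m.text for m in ds_node.findall("meta")},
            acl={p.get("user"): p.text for p in ds_node.findall("permission")},
        ))
    for child in node.findall("collection"):
        coll.children.append(collection_from_xml(child))
    return coll


# Illustrative values only.
root = Collection(
    "myActiveNeuroCollection",
    data_sets=[DataSet("image.jpg", 2048, "2003-10-07",
                       {"modality": "MRI"}, {"alice": "read"})],
    children=[Collection("patientRecordsCollection")],
)
xml_text = ET.tostring(root.to_xml(), encoding="unicode")
restored = collection_from_xml(ET.fromstring(xml_text))
```

Nesting collections inside collections is one simple way to get a hierarchical name space that can be composed from exported fragments.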

Operations on Logical Namespace
• OGSA/SOAP based interfaces
• Grid File System operations
  • Similar to traditional file system operations / POSIX
  • Open (= Get a GSR?), Read, Seek’n’Write, …
• Simple Control (Context) Operations
  • Management of Logical Namespace
  • SOAP based bindings
• Bulk (Content) Operations
  • Only SOAP bindings for data transport??? (NOPE)
  • Alternative mechanisms needed in standard
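To make the POSIX-like operations concrete, here is a minimal in-memory sketch of open, read, seek, write and close against logical paths. The class `GridFileService`, its method signatures and the integer handle are assumptions for illustration; the handle merely stands in for whatever reference an Open call would return, which the slide leaves as an open question. No SOAP binding or data transport is modeled.

```python
import io


class GridFileService:
    """Toy in-memory stand-in for a POSIX-like grid file service."""

    def __init__(self):
        self._store = {}        # logical path -> bytes
        self._handles = {}      # handle -> (logical path, working buffer)
        self._next_handle = 1

    def open(self, logical_path: str) -> int:
        """Return a handle positioned at the start of the data set."""
        data = self._store.get(logical_path, b"")
        handle = self._next_handle
        self._next_handle += 1
        self._handles[handle] = (logical_path, io.BytesIO(data))
        return handle

    def read(self, handle: int, size: int = -1) -> bytes:
        return self._handles[handle][1].read(size)

    def seek(self, handle: int, offset: int) -> int:
        return self._handles[handle][1].seek(offset)

    def write(self, handle: int, data: bytes) -> int:
        return self._handles[handle][1].write(data)

    def close(self, handle: int) -> None:
        """Commit the working buffer back to the store and drop the handle."""
        path, buffer = self._handles.pop(handle)
        self._store[path] = buffer.getvalue()


service = GridFileService()

h = service.open("/home/collection/notes.txt")   # hypothetical logical path
service.write(h, b"hello grid")
service.seek(h, 6)
service.write(h, b"file")                        # overwrite "grid" in place
service.close(h)

h2 = service.open("/home/collection/notes.txt")
service.seek(h2, 6)
tail = service.read(h2)
service.close(h2)
```

Control operations like these are small and fit a request/response binding; bulk content transfer is the part the slide says needs a different mechanism.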

How do we form the logical namespace?

Logical Layers (bits, data, information, ...)
• Collections or Virtual Directories
  • myActiveNeuroCollection
  • patientRecordsCollection
• Virtual Data Transparency
  • image.cgi
  • image.wsdl
  • image.sql
• Data Replica Transparency
  • image_0.jpg … image_100.jpg
• Data Identifier Transparency
  • E:\srbVault\image.jpg
  • /users/srbVault/image.jpg
  • Select … from srb.mdas.td where ...
• Storage Location Transparency
• Storage Resource Transparency

Storage Resource Transparency (1)
• Storage repository abstraction
  • Archival systems, file systems, databases, FTP sites, …
• Logical resources
  • Combine physical resources into a logical set of resources
  • Hide the type and protocol of physical storage system
  • Load balancing – based on access patterns
  • Unlike DBMS, user is aware of logical resources
  • Flexibility to changes in mass storage technology

Storage Resource Transparency (2)
• Standard operations at storage repositories
  • POSIX like operations on all resources
• Storage specific operations
  • Databases - bulk metadata access
  • Object ring buffers - object based access
  • Hierarchical resource managers - status and staging requests
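One way to picture the two storage resource transparency slides is a common driver interface that every repository implements, plus optional storage-specific calls, with a logical resource that groups several physical ones. Everything named below (`StorageDriver`, `InMemoryFileSystem`, `InMemoryDatabase`, `LogicalResource`, `bulk_metadata`) is hypothetical, and the "least accessed so far" placement rule is only a simple stand-in for load balancing based on access patterns.

```python
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """Standard operations every repository is assumed to support."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...


class InMemoryFileSystem(StorageDriver):
    def __init__(self):
        self._files = {}

    def read(self, path):
        return self._files[path]

    def write(self, path, data):
        self._files[path] = data


class InMemoryDatabase(StorageDriver):
    def __init__(self):
        self._rows = {}

    def read(self, path):
        return self._rows[path]["data"]

    def write(self, path, data):
        self._rows[path] = {"data": data, "size": len(data)}

    def bulk_metadata(self) -> dict:
        """Storage-specific operation: fetch metadata for all rows at once."""
        return {path: row["size"] for path, row in self._rows.items()}


class LogicalResource:
    """Groups physical resources and hides which one holds a given path."""

    def __init__(self, drivers):
        self._drivers = list(drivers)
        self._location = {}                      # path -> driver index
        self._access = [0] * len(self._drivers)  # accesses per driver

    def write(self, path: str, data: bytes) -> None:
        index = self._location.get(path)
        if index is None:
            # Place new data on the least-accessed driver (ties: first one).
            index = min(range(len(self._drivers)),
                        key=self._access.__getitem__)
            self._location[path] = index
        self._access[index] += 1
        self._drivers[index].write(path, data)

    def read(self, path: str) -> bytes:
        index = self._location[path]
        self._access[index] += 1
        return self._drivers[index].read(path)


fs, db = InMemoryFileSystem(), InMemoryDatabase()
pool = LogicalResource([fs, db])
pool.write("a.dat", b"aaa")   # lands on fs
pool.write("b.dat", b"bb")    # lands on db, which has seen fewer accesses
```

Callers of `pool` never see whether a path lives in the file system or the database, while code that holds `db` directly can still use its storage-specific call.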

Storage Location Transparency
• Support replication of data for performance
• Transparent access to physical location and physical resource
• Virtualization of distributed data resources
  • Data naming managed by the data grid
• Redundancy for preservation
  • Resource redundancy – “m of n” resources in list
  • Location redundancy – replicate at multiple locations
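The redundancy bullets can be sketched as follows, under one possible reading of "m of n": a write is attempted on all n resources in a list and counts as successful if at least m of them accept it, and a read is served from whichever replica answers first. The `Site` class, the site names and both helper functions are hypothetical.

```python
class Site:
    """Hypothetical storage location that may be offline."""

    def __init__(self, name: str, online: bool = True):
        self.name = name
        self.online = online
        self.files = {}

    def write(self, path: str, data: bytes) -> None:
        if not self.online:
            raise OSError(f"{self.name} is offline")
        self.files[path] = data

    def read(self, path: str) -> bytes:
        if not self.online:
            raise OSError(f"{self.name} is offline")
        return self.files[path]   # KeyError if this site has no replica


def replicate_write(sites, path: str, data: bytes, m: int):
    """Try all n sites; succeed only if at least m accepted the write."""
    succeeded = []
    for site in sites:
        try:
            site.write(path, data)
            succeeded.append(site.name)
        except OSError:
            continue
    if len(succeeded) < m:
        raise OSError(f"only {len(succeeded)} of {len(sites)} sites "
                      f"accepted the write, needed {m}")
    return succeeded


def read_any(sites, path: str) -> bytes:
    """Return the data from the first site that can serve it."""
    for site in sites:
        try:
            return site.read(path)
        except (OSError, KeyError):
            continue
    raise FileNotFoundError(path)


# Illustrative site names only.
sites = [Site("site-a"), Site("site-b", online=False), Site("site-c")]
written = replicate_write(sites, "/home/collection/image.jpg", b"pixels", m=2)

sites[0].online = False           # lose one location after the write
recovered = read_any(sites, "/home/collection/image.jpg")
```

The caller names only the logical path; which physical location serves the read is decided inside `read_any`.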

Data Identifier Transparency
• Four Types of Data Identifiers:
  1. Unique name
     • OID or handle
  2. Descriptive name
     • Descriptive attributes – meta data
     • Semantic access to data
  3. Collective name
     • Logical name space of a collection of data sets
     • Location independent
  4. Physical name
     • Physical location of resource and physical path of data
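A small catalog sketch shows how the four identifier types could coexist on one entry and how a data set might be found through any of the first three while the physical name stays hidden. The `Catalog` and `CatalogEntry` classes, the use of a UUID as the unique name, and the sample values are assumptions for illustration only.

```python
from dataclasses import dataclass
import uuid


@dataclass
class CatalogEntry:
    unique_name: str      # 1. unique name (OID or handle)
    attributes: dict      # 2. descriptive name (metadata attributes)
    logical_name: str     # 3. collective name (location independent)
    physical_names: list  # 4. physical names: (resource, physical path)


class Catalog:
    def __init__(self):
        self._entries = []

    def register(self, logical_name, attributes, physical_names):
        entry = CatalogEntry(uuid.uuid4().hex, dict(attributes),
                             logical_name, list(physical_names))
        self._entries.append(entry)
        return entry

    def by_unique_name(self, unique_name):
        return next(e for e in self._entries
                    if e.unique_name == unique_name)

    def by_logical_name(self, logical_name):
        return next(e for e in self._entries
                    if e.logical_name == logical_name)

    def by_attributes(self, **wanted):
        """Semantic access: all entries whose attributes match."""
        return [e for e in self._entries
                if all(e.attributes.get(k) == v for k, v in wanted.items())]


catalog = Catalog()
entry = catalog.register(
    "/home/collection/image.jpg",                   # hypothetical logical name
    {"modality": "MRI", "subject": "42"},           # hypothetical attributes
    [("unix-fs", "/users/srbVault/image.jpg")],     # resource label is made up
)
catalog.register("/home/collection/notes.txt", {"modality": "text"}, [])

matches = catalog.by_attributes(modality="MRI")
```

Moving the data to another resource would change only `physical_names`; the other three identifiers remain valid.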

Data Replica Transparency
• Replication
  • Improve access time
  • Improve reliability
  • Provide disaster backup and preservation
  • Physically or Semantically equivalent replicas
• Replica consistency
  • Synchronization across replicas on writes
  • Updates might use “m of n” or any other policy
  • Distributed locking across multiple sites
• Versions of files
  • Time-annotated snapshots of data
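The "time-annotated snapshots" idea can be illustrated with a versioned data set that keeps every write tagged with a timestamp and answers "what did this look like at time t?". The class `VersionedDataSet` and the plain numeric timestamps are assumptions; replica synchronization and distributed locking are not modeled here.

```python
import bisect


class VersionedDataSet:
    """Keeps every written snapshot, ordered by its timestamp."""

    def __init__(self):
        self._times = []       # sorted timestamps
        self._snapshots = []   # snapshot data, parallel to _times

    def write(self, timestamp: float, data: bytes) -> None:
        index = bisect.bisect_right(self._times, timestamp)
        self._times.insert(index, timestamp)
        self._snapshots.insert(index, data)

    def read_as_of(self, timestamp: float) -> bytes:
        """Return the newest snapshot written at or before `timestamp`."""
        index = bisect.bisect_right(self._times, timestamp)
        if index == 0:
            raise KeyError(f"no version at or before {timestamp}")
        return self._snapshots[index - 1]

    def latest(self) -> bytes:
        return self._snapshots[-1]


versions = VersionedDataSet()
versions.write(100, b"v1")
versions.write(200, b"v2")

old = versions.read_as_of(150)   # falls between the two writes
```

Keeping the timestamps sorted lets a lookup use binary search instead of scanning every snapshot.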

Conclusion
• Lots of possibilities
• Need for a Standard Grid File Schema and Global Logical Namespace for virtualization
• Need for Standard description of Operations or Grid File System Service
• Call for
  • Users, Projects
  • Developers, Vendors
• It’s a stone’s throw away – together, we will do it.