EDG Replica Manager and Replica Location Service Status

EDG Replica Manager and Replica Location Service Status and Plans
Leanne Guy, Data Management Work Package: WP2
Leanne.Guy@cern.ch
http://cern.ch/grid-data-management

EDG Replication Services
[Architecture diagram: Reptor (Replica Manager) with components Client, Optimization (Optor), Transaction, Consistency, File Transfer, Pre-/Postprocessing and Subscription; Replica Location (Giggle, GDMP); Replica Metadata (RepMeC)]
30/09/2020 LCG Persistency Workshop

Replica Manager Components (1)
Reptor: Replication Manager
Ø Replication management system
Ø Entry point for all clients
Ø Triggers automated replication of files
Giggle: Replica Location Service
Ø Local Replica Catalog services (LRC): LFN-PFN mappings
Ø Replica Location Index services (RLI): index on LFNs
Ø Set of configurable servers
GDMP: Grid Data Mirroring Package
Ø Automated replication of files across Grid Storage Elements
Ø Automatic updating of the replica catalog
RepMeC: Replication Metadata Catalogue
Ø An instance of Spitfire with an RDBMS backend and a specialized schema

Replica Manager Components (2)
Optor: Optimisation service
Ø Replica selection based on economic modelling
Ø Automated replication for load balancing
Processing
Ø Hooks for pre- and postprocessing while replicating
Transaction
Ø Ensure atomic 'replication' functionality
Ø Robustness of service
Consistency
Ø Check consistent state of the replication services
Ø Ensure consistent view of files in RLS and SRM
Ø Ensure consistent Master file attribute
File Transfer
Ø GridFTP and other protocols

Replica Manager Architecture
[Architecture diagram: a User Interface and Resource Broker call the Core API of a per-site Replica Manager; at each site the Replica Manager connects to a Local Replica Catalogue, a Computing Element and a Storage Element, and uses the Optimisation API (Optimiser) and the Processing API (Pre-/Postprocessing); the Replica Location Index and Replica Metadata Catalogue are shared across sites]

File Mappings
[Diagram: on the logical side (RLS, RepMeC), file names LFN1, LFN2, LFN3 … LFNn all resolve through LFN0, the Grid Unique ID of a file; on the physical side, the file's replicas are PFN1, PFN2, PFN3 … PFNn]
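
The logical-to-physical mapping above can be sketched in a few lines. The catalogue layout, names and URLs below are illustrative assumptions, not the actual RLS schema:

```python
# Sketch (not EDG code): a file is identified by a Grid Unique ID (GUID);
# several human-readable LFNs may alias it, and each replica has its own PFN.
catalogue = {
    "guid-00aa": {
        "lfns": {"lfn:/grid/atlas/run1/file1", "lfn:/grid/atlas/alias1"},
        "pfns": {"gsiftp://se1.cern.ch/data/file1",
                 "gsiftp://se2.in2p3.fr/data/file1"},
    },
}

def pfns_for_lfn(lfn):
    """Resolve an LFN to all physical replicas via its GUID entry."""
    for entry in catalogue.values():
        if lfn in entry["lfns"]:
            return sorted(entry["pfns"])
    return []
```

Either LFN alias resolves to the same two replicas, which is exactly the indirection the GUID provides.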

Replica Location Problem
Given a unique logical identifier for some data, determine the physical location of one or more physical instances of this data.
Replica Location Service:
Ø maintains information on the physical location of files
Ø maintains the mapping between the logical identifier of data and all its physical instances
Ø provides access to this information

RLS Requirements
Versioning & read-only data
Ø distinct versions of files can be uniquely identified
Ø data published to the community are immutable
Size
Ø scale to hundreds of replica sites, 50 × 10⁸ LFNs, 500 × 10⁸ PFNs
Consistency
Ø view of all available PFNs not necessarily consistent
Security
Ø knowledge and existence of private data must be protected
Ø storage system protects integrity of data content
Performance
Ø 200 updates/second, average response time < 10 ms
Reliability
Ø no single point of failure; local and global state decoupled
Ø failure of a remote component does not hinder access to local components

Giggle Framework
Giggle: A Framework for Constructing Scalable Replica Location Services
Ø Joint collaboration between WP2 and Globus
Ø Paper submitted to SC 2002
Independent local state maintained in Local Replica Catalogues (LRCs)
Unreliable collective state maintained in Replica Location Indices (RLIs)
Soft state maintenance of RLI state
Ø relaxed consistency in the RLI, full state information in the LRC
Compression of soft state
Ø compress LFN information based on knowledge of logical collections
Membership and partitioning information maintenance
Ø RLS components change over time: failure, new components added
Ø Service discovery and system policies
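
The soft-state idea above can be sketched as follows: an RLI keeps per-LRC summaries with a timestamp and silently ages out entries that are not refreshed, so a dead LRC disappears from the index without explicit deregistration. The class, field names and timeout value are assumptions for illustration, not the Giggle implementation:

```python
import time

TIMEOUT = 30.0  # seconds before unrefreshed state is considered stale; illustrative

class ReplicaLocationIndex:
    """Toy RLI: unreliable collective state, refreshed by soft-state updates."""

    def __init__(self):
        self._state = {}  # lrc_url -> (set of LFNs, time of last update)

    def soft_state_update(self, lrc_url, lfns, now=None):
        # An LRC periodically re-sends its full LFN summary; each update
        # replaces the previous one and resets the expiry clock.
        self._state[lrc_url] = (set(lfns), now if now is not None else time.time())

    def lrcs_for(self, lfn, now=None):
        # Return only LRCs whose state is fresh enough to trust.
        now = now if now is not None else time.time()
        return [url for url, (lfns, t) in self._state.items()
                if lfn in lfns and now - t < TIMEOUT]
```

Passing `now` explicitly makes the expiry behaviour easy to demonstrate without real waiting.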

Local Replica Catalogue (LRC)
Ø Maintains replica information at a single site
Ø Complete locally consistent record
Ø Queries across multiple sites not supported
Ø Maintains mappings between LFNs and PFNs on associated storage systems
Ø Coordinates its contents with those of the storage system
Ø Responds to the following queries:
Ø Given an LFN, find the set of PFNs associated with that LFN
Ø Given a PFN, find the set of LFNs associated with that PFN
Ø Supports authentication and authorisation when processing remote requests
Ø Periodically sends information about its state to the RLIs
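
The two LRC queries listed above amount to a bidirectional mapping, which can be sketched as a pair of inverted indexes (illustrative, not the EDG API):

```python
from collections import defaultdict

class LocalReplicaCatalogue:
    """Toy LRC: LFN<->PFN mappings with lookups in both directions."""

    def __init__(self):
        self._lfn_to_pfns = defaultdict(set)
        self._pfn_to_lfns = defaultdict(set)

    def add_mapping(self, lfn, pfn):
        # Both indexes are updated together so the two views stay consistent.
        self._lfn_to_pfns[lfn].add(pfn)
        self._pfn_to_lfns[pfn].add(lfn)

    def pfns(self, lfn):
        # Given an LFN, find the set of PFNs associated with that LFN.
        return set(self._lfn_to_pfns[lfn])

    def lfns(self, pfn):
        # Given a PFN, find the set of LFNs associated with that PFN.
        return set(self._pfn_to_lfns[pfn])
```

Maintaining the inverse index eagerly trades a little memory for constant-time queries in either direction.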

Replica Location Index (RLI)
Index structure needed to support queries across multiple sites
Ø One or more RLIs to map LFNs to LRCs:
Ø Geographical partitioning – all PFNs of a set of LRCs are indexed
Ø Namespace partitioning 1 – for load balancing purposes
Ø Namespace partitioning 2 – only LFNs adhering to a specified pattern are indexed, for all LRCs
Ø Structure w.r.t. LRCs can be freely defined:
Ø redundancy, performance, scalability
Ø possibly not good for load balancing
Ø Many identical RLIs may be set up for load balancing
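
Pattern-based namespace partitioning can be sketched as routing each LFN to the RLI whose pattern it matches; the RLI names and patterns below are invented for illustration:

```python
import fnmatch

# Hypothetical deployment: each RLI indexes one slice of the LFN namespace.
rli_patterns = {
    "rli-a.example.org": "lfn:/grid/atlas/*",
    "rli-b.example.org": "lfn:/grid/cms/*",
}

def rlis_for_lfn(lfn):
    """Return the RLIs responsible for indexing this LFN."""
    return [rli for rli, pattern in rli_patterns.items()
            if fnmatch.fnmatch(lfn, pattern)]
```

With disjoint patterns each query touches a single index; overlapping patterns would instead give the redundancy mentioned above.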

RLS Architecture (1)
A two-level RLS layout: the RLIs contain pointers to LRCs only.
[Diagram: RLIs indexing LRCs; an LRC may be multiply indexed for higher availability, or indexed by only one RLI; an RLI may index the full namespace (all LRCs) or only a subset of LRCs]
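
The two-level lookup implied by this layout can be sketched as: an RLI maps an LFN to candidate LRCs, and each candidate LRC is then asked for the actual PFNs. The data structures here are illustrative assumptions standing in for the two services:

```python
def locate(lfn, rli_index, lrcs):
    """Two-level RLS lookup.

    rli_index: dict mapping LFN -> list of LRC names (the RLI's answer)
    lrcs:      dict mapping LRC name -> {LFN: set of PFNs} (each LRC's catalogue)
    """
    pfns = set()
    # Level 1: ask the index which LRCs claim to know this LFN.
    for lrc_name in rli_index.get(lfn, []):
        # Level 2: query each LRC for its authoritative LFN -> PFN mappings.
        pfns |= lrcs[lrc_name].get(lfn, set())
    return pfns
```

Because the RLI is only a hint (its state is soft), an LRC returning no PFNs for an indexed LFN is harmless: the union is simply smaller.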

RLS Architecture (2)
A hierarchical RLS topology: the RLIs point to LRCs and to other RLIs.
[Diagram: as in (1), but upper-level RLIs also index lower-level RLIs; an LRC may be multiply indexed for higher availability or indexed by only one RLI; an RLI may index the full namespace (all LRCs) or only a subset of LRCs]

RLS Server Prototype
[Diagram: client ↔ LRC/RLI server ↔ database]
Prototype implementation:
Ø Implemented in C; relies on the Grid Security Infrastructure and the globus_io socket layer
Ø Database access via ODBC (libiodbc), MyODBC, MySQL
Ø Multithreaded server, configurable as an LRC and/or RLI server

Performance Results (1)
Preliminary results only!
Ø Performance results document will be released soon
Platforms
Ø Solaris 2.8 (US)
Ø Red Hat Linux 6.1 (CERN)
Time to add/create/delete/read an LFN entry
Ø Number of entries in the LRC varied from 0 to 1000k
Ø ~15-16 ms per operation, no noticeable increase in time with database size
Ø ~1667 queries/sec, 67 updates/second
Time to perform a soft state update
Ø Increases linearly with the number of entries in the LRC
Ø ~8 secs for 1000 entries in the LRC
Ø ~10000 secs for 1000k entries in the LRC

Release Plans
Ø RLS is currently installed on 4 EDG nodes at CERN
Ø Current release is an alpha: initial testing and debugging completed
Ø Giggle paper submitted to SC 2002 with preliminary performance results
Ø Final performance results expected by end of July
Ø Expect to have an RLS RPM by end of June 2002
Ø Expect to have a full set of integrated replication services for testbed 2 by the end of September

Future Work
Ø Web Services paradigm – the RLS is one of the early adopters of the Open Grid Services Architecture (OGSA)
Ø OGSA interface will be available by the end of this year; still needs exact definition
Ø Compression of RLI state – Bloom filters
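
The Bloom-filter compression mentioned in the last bullet can be sketched like this: an LRC sends a fixed-size bit array instead of its full LFN list, trading occasional false positives (the RLI then points at an LRC that, when queried, simply returns nothing) for much smaller soft-state updates. The filter size and hash construction below are illustrative assumptions, not the parameters the project chose:

```python
import hashlib

M, K = 4096, 3  # bits in the filter and number of hash functions; illustrative

def _positions(lfn):
    """Derive K deterministic bit positions for an LFN from salted SHA-256."""
    for i in range(K):
        digest = hashlib.sha256(f"{i}:{lfn}".encode()).digest()
        yield int.from_bytes(digest[:4], "big") % M

def make_filter(lfns):
    """Compress a set of LFNs into an M-bit array for a soft-state update."""
    bits = bytearray(M // 8)
    for lfn in lfns:
        for pos in _positions(lfn):
            bits[pos // 8] |= 1 << (pos % 8)
    return bits

def maybe_contains(bits, lfn):
    """False means definitely absent; True means present or a false positive."""
    return all(bits[p // 8] & (1 << (p % 8)) for p in _positions(lfn))
```

A filter of a few hundred bytes can summarise thousands of LFNs, which is why it suits the "unreliable collective state" role of the RLI: false positives only cost an extra LRC query, never a wrong answer.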