POOL and File Catalogues in LCG2 Flavia Donno

POOL and File Catalogues in LCG-2
Flavia Donno (Flavia.Donno@cern.ch), CERN/INFN
INFN Grid Technical Board, Bologna, 6/12/2021

Summary
  • Summary of problems with RLS in LCG-2
  • The POOL File Catalogue
  • Interfacing the POOL File Catalogue with the RB
  • Discussions about using catalogues in GRID
  • Interaction with the SRM
  • Conclusions

Problems with RLS in LCG-2
Current CMS Data Challenges show clear problems using RLS
  • Partially due to the normal “learning curve” on all sides in using a new system
  • Some reasons are:
      - Not yet fully optimized service
      - Inefficient use of language bindings and query facilities
  • A few groups at CERN are trying to understand the issues:
      - Which queries are needed?
      - How to structure the metadata? Which catalog interface? Which indices?

Problems with RLS in LCG-2
  • But poor performance is also due to known RLS design problems!
  • File names and related metadata are used in one query
      - The RLS split of mapping data from file metadata (LRC vs. RMC) results in rather poor performance for combined queries
      - It forces the applications (e.g. POOL) to perform large joins on the client side rather than fully exploit the database backend
      - Acceptable performance and scalability still need a catalog design which keeps the data used in one query close together
  • Many catalog operations are bulk operations
      - The current RLS interface is very low level and results in large overheads on bulk operations (too many network round trips)
      - Transaction support would greatly simplify the deployment
          - A partially successful bulk insert/update requires recovery “by hand”
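
The cost of splitting mapping data from metadata can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in database; the table names (`mapping`, `metadata`), the columns and the sample rows are hypothetical and are not the RLS schema. The first function mimics a client-side join, issuing one extra query per matching file, while the second lets the database do the join in a single query.

```python
import sqlite3

def make_db():
    """Build an in-memory stand-in for a catalogue (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE mapping (guid TEXT PRIMARY KEY, pfn TEXT)")
    db.execute("CREATE TABLE metadata (guid TEXT PRIMARY KEY, run INTEGER)")
    db.executemany("INSERT INTO mapping VALUES (?, ?)",
                   [("g1", "srm://se1/f1"), ("g2", "srm://se1/f2"),
                    ("g3", "srm://se2/f3")])
    db.executemany("INSERT INTO metadata VALUES (?, ?)",
                   [("g1", 100), ("g2", 200), ("g3", 100)])
    return db

def client_side_join(db, run):
    """One metadata query, then one mapping query per matching GUID."""
    queries = 1
    guids = [row[0] for row in db.execute(
        "SELECT guid FROM metadata WHERE run = ? ORDER BY guid", (run,))]
    pfns = []
    for guid in guids:
        queries += 1  # each lookup would be a network round trip
        pfns.append(db.execute(
            "SELECT pfn FROM mapping WHERE guid = ?", (guid,)).fetchone()[0])
    return pfns, queries

def server_side_join(db, run):
    """The same result from a single query joined inside the database."""
    rows = db.execute(
        "SELECT m.pfn FROM mapping m JOIN metadata a ON a.guid = m.guid "
        "WHERE a.run = ? ORDER BY m.guid", (run,)).fetchall()
    return [row[0] for row in rows], 1

db = make_db()
client_result = client_side_join(db, 100)
server_result = server_side_join(db, 100)
```

Both calls return the same PFNs, but the client-side version needs a number of queries that grows with the number of matching files.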

Problems with the RLS in LCG-2
A few ways to tackle the problem:
  • Short term solutions
      - Improve RLS functionality
          - First list of queries and their relative importance received
          - Full file registration and lookup (including metadata) in one round trip
          - Full fragment registration and query (multiple files + their metadata) in a single transaction
  • Medium term solutions (for DC05)
      - Rethink the design of RLS and propose a new architecture
          - Even via the data management tools
          - See Marco’s talk: https://edms.cern.ch/file/479608/2/evolution-dm-lcg2-v2.pdf
  • Long term solutions (ARDA/EGEE)
      - Deal also with deployment (and replication/consistency issues): project-lcg-peb-distributed-database-deployment@cern.ch
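
Registering a whole fragment in a single transaction can be sketched as follows. This is only an illustration using Python's sqlite3 module, not the RLS or POOL API; the `files` table and the `register_fragment` helper are hypothetical. If any entry fails, the whole batch is rolled back, so nothing is left to recover by hand.

```python
import sqlite3

def register_fragment(db, entries):
    """Insert all (guid, pfn, lfn) entries, or none of them."""
    try:
        with db:  # commits on success, rolls back if an exception escapes
            db.executemany(
                "INSERT INTO files (guid, pfn, lfn) VALUES (?, ?, ?)", entries)
        return True
    except sqlite3.IntegrityError:
        return False

db = sqlite3.connect(":memory:")
# Hypothetical schema: the GUID is the unique key of each file.
db.execute(
    "CREATE TABLE files (guid TEXT PRIMARY KEY, pfn TEXT NOT NULL, lfn TEXT)")
db.commit()

ok = register_fragment(db, [("g1", "file:a.root", "lfn:a"),
                            ("g2", "file:b.root", "lfn:b")])
# The second batch reuses GUID g1, so the whole batch is rejected,
# including the otherwise valid g3 entry.
failed = register_fragment(db, [("g3", "file:c.root", "lfn:c"),
                                ("g1", "file:d.root", "lfn:d")])
count = db.execute("SELECT COUNT(*) FROM files").fetchone()[0]
```

After both calls the table holds only the two files from the first batch.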

The POOL File Catalogue
  • The POOL file catalog maintains a list of accessible files together with their unique and immutable file IDs.
  • The main “users” are the POOL storage components, which consult the file catalog when a new file is to be accessed.
  • The file catalog is also used to store some file-related metadata (LFN).
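
As a rough mental model, the catalogue can be pictured as a table keyed by file ID. The sketch below is a hypothetical in-memory illustration in Python, not the POOL API: `FileCatalogue`, `register`, `add_replica`, `lookup` and `guid_for_lfn` are invented names, and `uuid4` merely stands in for however POOL generates its file IDs.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class Entry:
    """Physical and logical names recorded for one file ID."""
    pfns: list = field(default_factory=list)
    lfns: list = field(default_factory=list)

class FileCatalogue:
    """Toy catalogue: the file ID is assigned once and never changes."""

    def __init__(self):
        self._entries = {}

    def register(self, pfn, lfn=None):
        """Record a new file and return its freshly generated ID."""
        guid = str(uuid.uuid4())
        entry = Entry(pfns=[pfn])
        if lfn is not None:
            entry.lfns.append(lfn)
        self._entries[guid] = entry
        return guid

    def add_replica(self, guid, pfn):
        """Attach another physical file name to an existing ID."""
        self._entries[guid].pfns.append(pfn)

    def lookup(self, guid):
        """Return the physical file names known for an ID."""
        return list(self._entries[guid].pfns)

    def guid_for_lfn(self, lfn):
        """Find the ID that carries a given logical file name, if any."""
        for guid, entry in self._entries.items():
            if lfn in entry.lfns:
                return guid
        return None

catalogue = FileCatalogue()
file_id = catalogue.register("file:events.root", lfn="events-2004")
catalogue.add_replica(file_id, "srm://se.example.org/events.root")
```

A storage component would then call `catalogue.lookup(file_id)` before opening the file.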

The POOL File Catalogue
  • Three different implementations are provided:
      - An XML file catalogue for single users in a job. Its information can be published in the other two implementations.
      - A native MySQL catalogue used in production farms. The content of this catalogue can be published in the Grid-flavored catalogue.
      - The RLS: this is the Grid-flavored catalogue.

The POOL File Catalogue
  • POOL is already widely used by experiments (CMS and ATLAS) for persistent data
  • POOL offers a very flexible interface for dealing with metadata (attributes)
  • It hides the details of the backend database from the user
  • The POOL Native File Catalogue could be a good starting point to prototype what the applications would like to have.

Interfacing POOL with the Resource Broker
[Diagram. Labelled components: Information Service, User Interface, Replica Manager Client / POOL Client, RLS / POOL Native File Catalogue, Replica Optimisation, Storage Element, Storage Element Monitor, Network Monitor. Labelled links: POOL Client communication, WP2 communication.]

Interfacing POOL with the Resource Broker/JDL
  • InputData (optional)
      - Refers to data used as input by the job: GUIDs, LFNs and/or PFNs
  • ReplicaCatalog
      - If specified, it invokes a special Replica Catalog handler helper
      - In the case of POOL it is a string with possibly a list of catalogues to use:
        “mysqlcatalog_mysql://user@localhost/mycat1 xmlcatalog_file:mycat2.xml”
  • DataAccessProtocol
      - As before: the protocol or the list of protocols which the application is able to speak for accessing InputData on a given SE
  • OutputSE (optional, as before)
      - The Uniform Resource Identifier of the output SE
      - The RB uses it to choose a CE that is compatible with the job and is close to the SE
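
A handler receiving such a catalogue list first has to split it into individual catalogues. The following Python sketch shows one way this might be done; `parse_catalogue_list` is a hypothetical helper, not part of POOL or the Resource Broker, and it assumes that entries are separated by whitespace and that the first underscore separates the catalogue type from its location.

```python
def parse_catalogue_list(contact):
    """Split a catalogue list into (catalogue_type, location) pairs.

    Assumption: entries are whitespace-separated and the first
    underscore divides the catalogue type from its location.
    """
    catalogues = []
    for item in contact.split():
        prefix, separator, location = item.partition("_")
        if not separator or not location:
            raise ValueError(f"unrecognised catalogue entry: {item!r}")
        catalogues.append((prefix, location))
    return catalogues

parsed = parse_catalogue_list(
    "mysqlcatalog_mysql://user@localhost/mycat1 xmlcatalog_file:mycat2.xml")
```

For the string on the slide this yields a MySQL catalogue at `mysql://user@localhost/mycat1` followed by an XML catalogue at `file:mycat2.xml`, in the order given.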

Discussions on Catalogues in LCG
  • At the experiment level, to define the data structure
      - People from all experiments participating in ARDA
  • At the middleware level, to provide a service component
      - LCG for middle/long term solutions and EGEE
  • At the deployment level, to solve scalability, consistency and fault-tolerance issues
      - Proposed project by Dirk Duellmann

Discussions on Catalogues in LCG
New LCG proposal
[Diagram. Labelled components: Central Catalog (LFN->GUID<-PFN, Collections, GUID->Attributes); Experiment database (contains metadata information; lightweight database, name->value per GUID); an RRS at each of Site 1, Site 2 and Site 3.]
  • Connection reuse
  • Timeout retries
  • Sync and async, queuing
  • Transactions exposed to users

Interactions with SRM
  1. Large (Tier-1) sites with existing MSS solutions and petabytes of data (e.g. CERN, FNAL, ...)
     -> The best solution would be for the MSS vendor to provide an SRM interface to the existing MSS code.
  2. Large (Tier-2) sites with 10-100 TB of data
     -> This requires a disk pool manager in order to handle many different storage and transport nodes, but hide them behind a single SRM interface (dCache SRM, Berkeley DRM, Castor SRM).
  3. Small (Tier-2) sites with < 10 TB
     -> These sites require a solution that scales to a few disk server nodes, and that is lightweight and easy to maintain.
  • CERN is working on case 3

Interactions with SRM
  • It is not clear that a distributed catalogue is needed for the lightweight disk pool.
  • If so, the proposed file catalogue can be a candidate.

Conclusions
  • The catalogues represent a BIG open issue …
      - People from all experiments are strongly involved
  • Prototyping helps to understand the requirements (the POOL interface to the RB goes in this direction)
  • Higher-level efforts will use solutions proposed by the deployment team to solve scalability, consistency and fault-tolerance issues
      - Proposed project by Dirk Duellmann
  • SRM interactions are not a big issue for the moment.
      - Lower-level layer

And...???