SA1 – Data Grid Interoperation
Enabling Grids for E-sciencE
EGEE-III INFSO-RI-222667

Grid Data Interoperation (Part II): Data, Metadata, Catalogues

[Diagram labels: GridFTP; SRM; FTS; BDII; Info; Data; DPM, dCache, StoRM, CASTOR, …; "SRM selects pool node…"; SRB; SRB Catalogue; SRB Storage Elements; Disk storage; User Domain; ISIS (neutron source at RAL); LFC Domain (LCG File Catalogue); LFN; GUID; SURL; TURL; lcg-*; Proposal; Instrument/experiment; Dataset; File; Parameters; Metadata]

Data transfers:
Three interoperation modes for data transfers:
1. FTS
2. SRM drives transfer via srmCopy() (not shown)
3. lcg-utils
Needs glue to make it work together.

Mode 1: Pretend SRB is a “Classic SE”
• Classic SE is (still) supported by gLite FTS.
• SRB SURLs are GSIFTP URLs; SRB TURLs are the same as the SURLs.

Mode 3: using lcg-utils
• lcg-* tools do not accept GSIFTP SURLs.
• Strategy: always clone the file to SRM, then register the clone in LFC.
• Fallback: register the GridFTP SURL in another catalogue, or hack LFC, or use AMGA to keep track of replicas.
• Can also be used to move data to/from disks with GridFTP.
• Add a static information provider using BDII.

Data mover:
• FTS still supports “Classic SE”.
• ASGC SRM interface to SRB will become preferred.
• Doesn’t move metadata though.

Metadata migration:
• Data is held in SRB; SRB’s metadata facility is hardly used.
• Metadata is held in a separate catalogue, the iCAT (not to be confused with the iRODS catalogue). Format is XML. iCAT uses Oracle.
• iCAT metadata is hierarchical: associated with an individual file or a dataset.
• Current support is simplistic: dump metadata with any file in the dataset (works in current limited scenarios).

Two approaches to file metadata management (primary key):
1. Use the GUID as filename – shallow hierarchy
2. Use the original filename (or an algorithmically derived name)
Example:
/grid/isis/guid/c4756f6e-7963-47ad-ac8a-59726afa4992
vs
/grid/isis/NDXINTER/Instrument/data/cycle_08_5/INTER00000544.raw
The former makes sense on the grid: register meaningful LFNs to point to the GUID in LFC. The latter does not depend on LFC/replications. Metadata is associated with the primary key. Avoid metadata in filenames!

Dataset attributes:
• Date, owner, run title, status, keywords, location
• Currently managing datasets with a separate dataset sequence attribute: once one file is found, the rest of the dataset is located. Mirrors original use, where metadata is kept with a single file.

Experiences:
• SRB metadata (key/value pairs)
• iCAT schema: only basic attributes so far (datasets, instrument, owner)
• iCAT in Google Code: http://code.google.com/p/icatproject/
• Can improve on original use, maybe
• Had to custom build metadata schema on gLite side
• Custom build metadata copier

TODO:
• Other SRB users: eMinerals, eMaterials, RMCS
• Work on integration with job submission, maybe portal
• Track work on datasets: provenance
• Improve metadata support

Data mover todo:
• Improve
• Modularise metadata porting
• Generalise?

References:
• S Burke et al: gLite User Guide, CERN EDMS 722398
• J Jensen, R Downing, M Hodges, D Ross: SRM and SRB interoperation
• F Bonifazi et al: LHCb experience with LFC replication, Proc CHEP 2007
• M Gleaves: ICAT software suite

Authors:
Jensen, STFC (corresponding)
Sam Skipsey, University of Glasgow
Chris Moreton-Smith, ISIS, STFC

Special thanks to Michael Gleaves and Brian Matthews, STFC, for iCAT discussions, and to Birger Koblitz, CERN, for AMGA support/suggestions.

http://www.ngs.ac.uk/
http://www.isis.rl.ac.uk/
http://www.gridpp.ac.uk/
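
The two approaches to the file metadata primary key (GUID as filename versus original filename) can be illustrated with a short sketch. This is not the poster authors' data mover; the function names, the `VO_ROOT` constant and the assumption that every instrument follows the `<instrument>/Instrument/data/<cycle>/<file>` layout are hypothetical and inferred only from the two example paths on the poster.

```python
import posixpath
import uuid

# Root of the LFN namespace, taken from the poster's example paths.
VO_ROOT = "/grid/isis"


def lfn_from_guid(guid: str, root: str = VO_ROOT) -> str:
    """Approach 1: shallow hierarchy, the GUID itself is the filename."""
    # uuid.UUID raises ValueError for malformed input and
    # normalises the text form to lowercase with hyphens.
    canonical = str(uuid.UUID(guid))
    return posixpath.join(root, "guid", canonical)


def lfn_from_original(instrument_dir: str, cycle: str, filename: str,
                      root: str = VO_ROOT) -> str:
    """Approach 2: keep the original filename and directory layout.

    The layout below is an assumption generalised from a single
    example path; real instruments may differ.
    """
    return posixpath.join(root, instrument_dir, "Instrument", "data",
                          cycle, filename)


# Metadata is attached to the primary key (the LFN), not encoded in the name.
metadata_by_key = {
    lfn_from_guid("c4756f6e-7963-47ad-ac8a-59726afa4992"): {
        "original_name": "INTER00000544.raw",
    },
}
```

With approach 1, a meaningful LFN can additionally be registered in the LFC to point at the same GUID; with approach 2, the key is usable without consulting the LFC at all, which is the trade-off the poster describes.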