WLCG DOMA report
Maria Girone, Simone Campana (CERN)
Simone.Campana@cern.ch - WLCG GDB, 17/10/2018

The WLCG Data Organization, Management and Access evolution project (DOMA)
Ø keep track of developments and advancements in all DOMA areas
Ø provide a forum to discuss ideas and foster interoperability of solutions
Ø an umbrella for experiments, middleware developers, storage providers and facilities

Introduction
§ Meeting on the 4th Wednesday of each month, from 16:00 (finishing before 17:30)
§ One topical discussion at every meeting + short report for each working group
§ DOMA general mailing list: wlcg-doma (at) cern.ch
§ Three active working groups:
  Ø Third Party Copy (TPC) protocols: wlcg-doma-tpc (at) cern.ch
  Ø Data Access, Content Delivery and Caching (ACCESS): wlcg-doma-access (at) cern.ch
  Ø Storage Quality of Service (QoS): wlcg-doma-qos (at) cern.ch
§ DOMA twiki: https://twiki.cern.ch/twiki/bin/view/LCG/DomaActivities
  Ø The twikis of the working groups are all linked from here

TPC
§ Chairs: Alessandra Forti and Brian Bockelman
§ https://twiki.cern.ch/twiki/bin/view/LCG/ThirdPartyCopy
§ Short term goal: investigate, commission and deploy alternative TPC protocols to gridFTP
  Ø Three phases (milestones) finishing in Dec 2019 with all sites providing storage to WLCG offering a non-gridFTP endpoint
§ Medium term goal: prototype token-based auth in TPC
  Ø Focus on bearer tokens (capability based): Macaroons and SciTokens
  Ø In line with the WLCG AAI task force (see later)
§ XRootD and HTTP/WebDAV are the candidate protocols

§ Initially focusing on functionality. Performance will follow.
§ Need to test the full matrix of storage technologies for each protocol
  Ø Through Rucio+FTS: an instance of Rucio was deployed for this by the Rucio team on Kubernetes
§ Current status: many caveats being addressed. Documentation is an important one.
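
As a rough illustration of what "the full matrix" implies, the sketch below enumerates every source/destination/protocol combination. The storage labels are placeholders (the slides do not list the storage technologies under test), and the function name is hypothetical; this is not the Rucio+FTS test setup itself.

```python
from itertools import product

# Placeholder labels: the slides name the candidate protocols but not the
# storage technologies, so these are purely illustrative.
STORAGE_TECHNOLOGIES = ["storage-A", "storage-B", "storage-C"]
PROTOCOLS = ["xrootd", "http"]


def build_test_matrix(storages, protocols):
    """Return every (source, destination, protocol) combination to exercise."""
    return [
        {"source": src, "destination": dst, "protocol": proto}
        for proto, src, dst in product(protocols, storages, storages)
    ]


matrix = build_test_matrix(STORAGE_TECHNOLOGIES, PROTOCOLS)
# 2 protocols x 3 source technologies x 3 destination technologies
print(len(matrix))
```

The number of combinations grows quadratically with the number of storage technologies, which is one reason an automated driver is useful for this kind of commissioning.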

ACCESS
§ A broad scope WG: data access performance, content delivery and caching
§ Chairing team: S. Jezequel, I. Vukotic, F. Wuerthwein, X. Espinal, M. Schulz
§ https://twiki.cern.ch/twiki/bin/view/LCG/ContentDeliveryCaching
§ Started by looking at existing activities in this broad domain and organizing the information by topic
  Ø https://docs.google.com/document/d/1Sk5wtFLdHDCjyc_VmTwJY4qzk_ErKidBs7GLhTnN4xo/edit
§ Two main topics:
  Ø Data access patterns and access performance studies
  Ø Organizing and deploying caching solutions

§ A lot of activity around caching, including latency hiding and bandwidth levelling
§ “Cache performance will completely depend on how we will use them”
§ Two very interesting studies
  Ø Caching simulation based on access records in MWT2. Study of the cache effectiveness based on workflows/file type
  Ø Concluding that caching has to be “content and workflow”-aware
§ Xcache configuration and deployment options, with pros and cons
  Ø Local node -> In parallel to managed storage -> Replacing managed storage -> multi-level cache (node, site, region)
§ https://indico.cern.ch/event/763847/
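
A minimal sketch of the kind of trace-driven study mentioned above: replay access records through a cache and report the hit rate per file type. The LRU eviction policy, the record layout, and the sample records are all assumptions made for illustration; the slides do not describe the policy or data used in the MWT2 simulation.

```python
from collections import OrderedDict, defaultdict


def simulate_lru(accesses, capacity_bytes):
    """Replay (file_name, file_type, size_bytes) records through an LRU cache.

    Returns a dict mapping each file type to its hit rate.
    """
    cache = OrderedDict()  # file_name -> size_bytes, least recently used first
    used = 0
    hits = defaultdict(int)
    total = defaultdict(int)
    for name, file_type, size in accesses:
        total[file_type] += 1
        if name in cache:
            hits[file_type] += 1
            cache.move_to_end(name)  # mark as most recently used
            continue
        if size > capacity_bytes:
            continue  # file can never fit, so it is not cached
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict LRU entry
            used -= evicted_size
        cache[name] = size
        used += size
    return {t: hits[t] / total[t] for t in total}


# Invented records: a small file read repeatedly, large files read in turn.
records = [
    ("a.small", "analysis", 2),
    ("b.large", "raw", 8),
    ("a.small", "analysis", 2),
    ("c.large", "raw", 8),
    ("a.small", "analysis", 2),
    ("b.large", "raw", 8),
]
rates = simulate_lru(records, capacity_bytes=10)
for file_type, rate in sorted(rates.items()):
    print(f"{file_type}: {rate:.2f}")
```

Even in this toy trace the hit rate differs sharply between the two file types, which is the sort of effect that motivates a “content and workflow”-aware cache.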

QoS
§ Chaired by Paul Millar
§ Recently started, mandate under construction
§ However, the high level goals are:
  Ø At the storage level, define, implement and expose different classes based on performance/reliability needs and what you can afford
  Ø Integrate the notion of storage classes in the higher levels, such as experiment DDM systems. Leveraging on XDC
§ Not a new concept: What we call “Disk” and “Tape” are in fact QoS.
§ A potential source of large hardware savings for HL-LHC, and it favors the integration of new storage technologies
§ Trade off performance, reliability and cost based on the use case
[Figure labels: performance, cost, reliability]
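
One way to picture the performance/reliability/cost trade-off is as a selection over storage classes: pick the cheapest class that meets a use case's requirements. The sketch below assumes this reading. The class names, attributes, and numbers are invented for illustration and do not come from the working group, whose mandate was still being defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageClass:
    name: str
    throughput_mb_s: float  # performance
    durability_nines: int   # reliability
    cost_per_tb: float      # cost, arbitrary units


# Invented numbers, for illustration only.
CLASSES = [
    StorageClass("tape", 50, 5, 10.0),
    StorageClass("disk", 500, 3, 40.0),
    StorageClass("disk-replicated", 500, 5, 80.0),
]


def cheapest_class(classes, min_throughput, min_nines):
    """Return the cheapest class meeting both requirements, or None."""
    candidates = [
        c for c in classes
        if c.throughput_mb_s >= min_throughput
        and c.durability_nines >= min_nines
    ]
    return min(candidates, key=lambda c: c.cost_per_tb) if candidates else None


# Archival use case: high reliability, low performance requirement.
print(cheapest_class(CLASSES, min_throughput=10, min_nines=5).name)
# Scratch space for analysis: high performance, modest reliability.
print(cheapest_class(CLASSES, min_throughput=200, min_nines=3).name)
```

A higher-level system such as an experiment DDM could, under this model, state requirements per dataset and leave the choice of class to the storage layer.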

DOMA related network activities
§ Network R&D activities, focusing on data transfer
  Ø DTNs, low level transfer protocols, bandwidth on demand, P2P channels, SDNs, …
§ Collaboration with the SKA AENEAS project and HEPiX
§ Leveraging information from FTS as file transfer manager
§ http://cern.ch/go/7qxT

DOMA and AAI
§ AAI evolution in WLCG is driven by the WLCG AuthZ WG
  Ø Prototyping an architecture of which DOMA activities are one aspect
  Ø X.509-free, based on JSON Web Tokens
§ The WG collected requirements and is evaluating existing solutions for the WLCG MB
§ All in line with the DOMA needs and strategy
§ http://cern.ch/go/9fXq
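
For readers unfamiliar with JSON Web Tokens, the sketch below shows their basic shape: three base64url segments (header, claims, signature) joined by dots. It builds an unsigned demo token and reads back the claims using only the Python standard library. The claim values are hypothetical, and the code deliberately skips signature verification, which any real service must perform with a proper JWT library.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments are written."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode a base64url segment, restoring the stripped padding."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def decode_claims(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    _header_b64, payload_b64, _signature = token.split(".")
    return json.loads(b64url_decode(payload_b64))


# Unsigned demo token; the subject and scope values are made up.
header = b64url_encode(json.dumps({"alg": "none", "typ": "JWT"}).encode())
payload = b64url_encode(
    json.dumps({"sub": "demo-user", "scope": "storage.read:/demo"}).encode()
)
demo_token = f"{header}.{payload}."

claims = decode_claims(demo_token)
print(claims["sub"], claims["scope"])
```

Because the claims travel inside the token, a storage endpoint can authorize a request from the token alone, which is what makes this approach attractive as an alternative to X.509 proxies.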

Dude, where is the Data Lake WG????

§ DOMA works bottom-up: it does not define an architecture at this stage
§ We have indications of what facilities and services will look like, and we define an R&D program to prototype various aspects
The current DOMA working groups will enable the technology for a Data Lake
We have no WG dedicated to distributed storage, but the challenges are being addressed in ACCESS, Network, QoS, TPC
We had no real discussion on interoperability services: Rucio being used by ATLAS and CMS (with interest from other communities) opens a big opportunity
[Figure labels: QoS; ACCESS, Network; TPC, Network]

Conclusions
§ There is a lot of work going on, in line with the WLCG Strategy document
§ We also have a list of topics we have not yet discussed (tape carousels, interoperability services)
§ The first milestone is the preparation for the LHCC review of the strategy, where we intend to present our findings
§ We will finalize this preparation at the HSF/WLCG/OSG workshop at JLab
§ We work in synergy with existing projects/initiatives, e.g. XDC and, in the future, ESCAPE
§ DOMA WGs are led by experiments, facilities and middleware providers. Working very effectively