EMI Data the second year Patrick Fuhrmann EMI


EMI Data, the second year
Patrick Fuhrmann, EMI
Vancouver, CA, 27.10.2011
Happy 20th anniversary
EMI is partially funded by the European Commission under Grant Agreement RI-261611

Content
• Reminder
  – EMI in general
  – EMI release plan
  – What happens after EMI
• EMI Data in a nutshell
• Selected topics
  – Catalogue synchronization
  – FTS 3: plans
  – Data client library consolidation
  – WebDAV for dCache/DPM and LFC
  – pNFS for dCache and DPM
• Update on SE's
  – DPM
  – dCache
With contributions by
• Ricardo Rocha
• Paul Millar
• Zsolt Molnar
• Tigran Mkrtchyan
• Jon Kerr Nilsen
• Alejandro Ayllon
• Fabrizio Furano
• Alberto Di Meglio (Boss)
10/27/11 Vancouver, HEPIX, EMI
EMI INFSO-RI-261611

Just in case …

EMI factsheets
EMI in general

Where we are
(Diagram, stolen from Alberto Di Meglio. Timeline: Before EMI, 3 years, After EMI. Labels: Applications; Integrators, System Administrators; Standard interfaces; EMI Reference Services; Specialized services, professional support and customization; Standards, New technologies (clouds); Users and Infrastructure Requirements.)

Release and support policy
(Timeline diagram, stolen from Alberto Di Meglio.)
• Release names: Kebnekaise (Giebnegáisi; Lappland, Sweden, 2100 m) and Matterhorn (Switzerland/Italy, 4478 m)
• Major releases: Start, EMI 0 and EMI 1 (done); EMI 2 and EMI 3 (in preparation)
• Dates on the timeline: 01/05/2010, 31/10/2010, 30/04/2012, 28/02/2013
• Support & Maintenance periods are marked along the releases

What happens after May 2013?
• Not clear.
• The EU reviewers strongly recommended putting more effort into future planning.
• A strategic director (SD) has been nominated and is now in place.
• NA3, together with the SD, has to find a sustainability model for the time beyond EMI.
• An organization similar to 'Apache' is under discussion, combining the different product teams (PTs) into an open source initiative (NOT a new EMI EU project).
  o Benefits for the customers?
  o Benefits for the PTs?

EMI factsheets
And now to EMI - Data

EMI Data
(Diagram labels: Marketing; Improving existing components; Integration; Standardization; Improving user satisfaction.)

Objectives in a nutshell
• Improving existing infrastructures
  – GLUE 2.0
  – FTS 3 (next generation File Transfer Services)
  – Storage element and catalogue synchronization
• Integration
  – ARGUS integration
  – UNICORE integration
  – EMI common data library
• Standardization
  – SRM over SSL including delegation
  – POSIX file access / NFS 4.1 / pNFS
  – WebDAV for file and catalogue access
  – Storage Accounting Record implementation
  – EMI Data clouds

Objectives in a nutshell (cont)
• Improved user satisfaction
  – Adhering to operating system standards for service operation and control, regarding configuration, log and temporary file locations, and service start/status/stop
  – Providing and supporting monitoring probes for EMI services
  – Improving usability of client tools, based on customer feedback, by ensuring
      • better, more informative, less contradictory error messages
      • coherency of command line parameters
  – Porting, releasing and supporting EMI components on identified platforms (full distribution on SL 6 and Debian 6, UI on SL 5/32 and the latest Ubuntu)
  – Introducing minimal denial-of-service protection for EMI services via configurable resource limits
  – Providing optimized semi-automated configuration of service back-ends (e.g. databases) for standard deployments

Content of this presentation
Some selected topics

SE and catalogue synchronization
• Storage element and catalogue synchronization
  – Event-based synchronization of data location information between SEs and catalogues
  – Supposed to solve:
      • Dangling references in catalogues (pointers to lost files)
      • Synchronizing access permission information between SEs and catalogues?
  – Doesn't solve:
      • Dark data (files in SEs which are not referenced from catalogues)
(Diagram labels: DPM, StoRM or dCache; Generic Adapter with SE or catalogue specific plug-in; messaging infrastructure; list of removed files; Generic Adapter; LFC or experiment catalogue; command line interface.)
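The event-based flow on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the EMI implementation: the `RemovalEvent` message shape, the in-memory `Catalogue` class and the host names are all hypothetical, and a plain Python list stands in for the messaging infrastructure.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RemovalEvent:
    """Hypothetical message an SE plug-in publishes when files are lost or removed."""
    storage_element: str
    removed_paths: tuple


@dataclass
class Catalogue:
    """Toy in-memory catalogue: logical file name -> set of (SE, path) replicas."""
    replicas: dict = field(default_factory=dict)

    def register(self, lfn, storage_element, path):
        self.replicas.setdefault(lfn, set()).add((storage_element, path))

    def apply(self, event):
        """Drop every replica entry named in the event; return how many were dropped."""
        removed = {(event.storage_element, p) for p in event.removed_paths}
        dropped = 0
        for lfn in list(self.replicas):
            stale = self.replicas[lfn] & removed
            dropped += len(stale)
            self.replicas[lfn] -= stale
            if not self.replicas[lfn]:
                del self.replicas[lfn]  # no replica left: the entry would dangle
        return dropped


catalogue = Catalogue()
catalogue.register("/grid/exp/run1.root", "se1.example.org", "/pool/a/run1.root")
catalogue.register("/grid/exp/run1.root", "se2.example.org", "/pool/b/run1.root")
catalogue.register("/grid/exp/run2.root", "se1.example.org", "/pool/a/run2.root")

# A plain list stands in for the messaging infrastructure.
queue = [RemovalEvent("se1.example.org", ("/pool/a/run1.root", "/pool/a/run2.root"))]
dropped = sum(catalogue.apply(event) for event in queue)
print(dropped, sorted(catalogue.replicas))
```

Dark data stays out of reach of such a scheme, as the slide notes: a file that was never registered has no catalogue entry for a removal event to correct.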

The new FTS: FTS 3
• Next generation File Transfer Services, FTS 3
  – Redesign based on the experience of the last years
  – Based on GFAL-2
  – Decommissioning of the channel concept
  – Prototype ready in April '12 (framework for new approaches)
  – Many interesting new approaches
      • Support of http including 3rd party copy (delegation)
      • Feedback of real resource utilization
          o Interactively
          o Automatically (callout to storage elements)
          o Autonomously (learning)

The consolidated EMI-Data Lib
• October 2011: Deliver consolidation plan in EMI
  – Draft exists, main ideas ready
• December 2011: Finish prototype implementation
  – Prototype should be ready for EMI-2
  – Merging 2 data libraries in two months is challenging
  – Initial work already started
• 2012: Testing
  – Many crucial components are affected
  – Plenty of testing needed to achieve production quality
• December 2012: Finish migration to EMI data

WebDAV front end for LFC/SE's
(Diagram labels: WebDAV; ROOT; LFC; three storage elements.)
• Prototype works with LFC / DPM / dCache. No aggregation library, but using natural http protocol redirection.
• BUT: completely ignoring SRM semantics. Has to be fixed by e.g. new entries in LFC or an http/REST mapping service instead of SRM.
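The "natural http protocol redirection" mentioned on this slide can be illustrated with a small sketch: a catalogue front end answers a request for a logical file name with a 302 redirect to a replica on a storage element, so no client-side aggregation library is needed. The replica table, the host names and the `resolve` function are hypothetical and only show the idea, not the LFC or DPM/dCache code.

```python
from http import HTTPStatus

# Hypothetical replica table: logical file name -> replica URLs on storage elements.
REPLICAS = {
    "/grid/exp/run1.root": [
        "https://se1.example.org/pool/a/run1.root",
        "https://se2.example.org/pool/b/run1.root",
    ],
}


def resolve(lfn):
    """Return (status, headers) as a catalogue front end might answer a GET."""
    urls = REPLICAS.get(lfn)
    if not urls:
        return HTTPStatus.NOT_FOUND, {}
    # Plain HTTP redirection: the client follows Location to the storage element.
    return HTTPStatus.FOUND, {"Location": urls[0]}


status, headers = resolve("/grid/exp/run1.root")
print(int(status), headers["Location"])
```

Any standard HTTP client follows such a redirect on its own, which is what makes the approach attractive. Nothing in this exchange carries SRM semantics, which is the gap the slide points out.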

News on NFS 4.1 / pNFS
• pNFS is a done deal
• dCache
  – DESY Grid Lab Tier II continues testing and improvements
  – Production: photon science people at DESY
• DPM
  – "Burn in" testing phase with large (400-1000 core) system in Taipei
• RH 6.2 is coming with a pNFS-enabled kernel
  – SL 6 will follow within weeks after 6.2 is official
• Open questions
  – X509 authentication (possible solution discussed in Padova, EMI AHM)
  – Wide area transfer evaluation (DESY GridLab, SFU, CERN, Taipei)

SE's in EMI
Breaking news: DPM

News from DPM
• Ricardo replaced Jean-Philippe as DPM/LFC PI.
• DPM 1.8.2
  – Improved scalability of all frontend daemons
      • Especially with many concurrent clients
  – Faster DPM drain
  – Better balancing of data among disk nodes
      • Different weights to each filesystem
• Improved validation & testing
  – Collaboration with ASGC for this purpose (thanks!)
  – Hammercloud tests running regularly
  – They started with a 400 core setup, we looked at the issues, now moving to 1000 cores to increase load
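The slide does not say how DPM 1.8.2 applies the per-filesystem weights. One common way to balance data with weights is weighted random selection, sketched below as an assumption rather than as DPM's algorithm; the filesystem names and weight values are hypothetical.

```python
import random
from collections import Counter


def pick_filesystem(weights, rng):
    """Pick one filesystem with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical filesystems and weights; a weight of 0 takes a filesystem out of rotation.
weights = {"disk01:/data1": 3, "disk02:/data1": 1, "disk03:/data1": 0}
rng = random.Random(42)  # seeded so the run is reproducible
placements = Counter(pick_filesystem(weights, rng) for _ in range(4000))
print(placements)
```

With these weights roughly three out of four new files land on `disk01`, and none on `disk03`.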

Future releases: DPM (provided by Ricardo)
Releases on the timeline: 1.8.3 (November), 1.8.4 (January), 1.8.5
Planned items:
• Package consolidation: EPEL compliance
• Fixes in multi-threaded clients
• Replace httpg with https on the SRM
• Improve dpm-replicate (dirs and FSs)
• GUIDs in DPM
• Synchronous GET requests
• Reports on usage information
• Quotas
• Accounting metrics
• HOT file replication

News from DPM (Administration)
• DPM Admin contrib package
  – Contribution from GridPP
  – Now packaged and distributed with the DPM components
  – http://www.gridpp.ac.uk/wiki/DPM-admin-tools
• Nagios monitoring plugins for DPM
  – Available now
  – https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/Admin/Monitoring
• Puppet templates
  – Available now in beta
  – https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/Admin/Puppet

Some news from dCache

Slightly modified release numbers
(Timeline diagram. Labels: LHC Tech. Break; April; 2011; 2012; EMI-1; EMI-2; release numbers 1.9.12, 1.9.13, 1.9.14, 2.0, 2.1, 2.2.)

More on dCache
Some dCache lab secrets
But only because of
(Image label: 20)

Adapting different back-ends
(Diagram: the protocols pNFS, WebDAV, gridFTP and xRootD on top of the dCache pool; underneath, the Data Access Abstraction layer with the back-ends: mounted file system (EXT4, XFS, GPFS), Hadoop FS, object store, file or whatever (***).)

Pool storage abstraction
• The pool data access abstraction layer allows plugging in different storage back-ends
• We start with Hadoop FS as a proof of concept
  – Feature set of dCache (pNFS, WebDAV, ...) plus
  – Easy maintenance of Hadoop FS
• Pools might no longer be multi-purpose, e.g.
  – Hadoop FS is not very good at random seeks
  – Object stores might only support PUT, GET
• Allows sites to migrate from BestMan/Hadoop to dCache
• Will try object stores later
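A data access abstraction of this kind can be pictured as a small plug-in interface. The sketch below is an illustration only: dCache itself is written in Java, and the class names, the `supports_random_read` flag and the in-memory storage are assumptions made for this example. It shows why a pool on top of a PUT/GET-only back-end is no longer multi-purpose.

```python
from abc import ABC, abstractmethod


class PoolBackend(ABC):
    """Hypothetical data access abstraction a pool would program against."""

    supports_random_read = True

    @abstractmethod
    def put(self, key, data):
        """Store a whole object."""

    @abstractmethod
    def get(self, key):
        """Return a whole object."""

    def read_at(self, key, offset, size):
        """Random read; back-ends without efficient seeks refuse it."""
        if not self.supports_random_read:
            raise NotImplementedError(f"{type(self).__name__} supports PUT/GET only")
        return self.get(key)[offset:offset + size]


class InMemoryBackend(PoolBackend):
    """Shared toy storage: a dict replaces the real disk or remote service."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]


class MountedFileSystem(InMemoryBackend):
    """Stand-in for a POSIX back-end such as EXT4, XFS or GPFS."""


class ObjectStore(InMemoryBackend):
    """Stand-in for an object store that offers whole-object PUT and GET only."""

    supports_random_read = False


for backend in (MountedFileSystem(), ObjectStore()):
    backend.put("run1", b"0123456789")
    try:
        print(type(backend).__name__, backend.read_at("run1", 2, 4))
    except NotImplementedError as exc:
        print(exc)
```

A pool could consult such a capability flag to decide which protocols it offers: random-access protocols such as pNFS only where the back-end seeks well, streaming transfers everywhere.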

The Three Tier Model

The Three Tier Model (Motivation)
Different storage back-ends have different properties
• Tape
  o Single stream
  o Non shareable
  o High latency
  o Cheap, reliable
  o Low power
• Spinning disk
  o Multiple stream
  o Medium shareable
  o Medium latency
  o Reasonable speed
  o Medium costs
• SSD
  o Multiple stream
  o Highly shareable
  o Low latency
  o Good speed
  o Super expensive
Different protocols/applications have different requirements
• Random access / Analysis
  o Many uncontrollable streams
  o Very low latency requirements
  o Chaotic seeks
  o Transfer speeds not that important
• WAN Transfer / Reconstruction
  o Controlled/low number of streams
  o Latency doesn't matter
  o High transfer speeds
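The motivation above amounts to matching workload requirements against back-end properties. A rough sketch of that matching is given below; the ordinal ranks transcribe the slide's qualitative labels, while the numeric encoding, the `Tier` class and the "cheapest tier that fits" rule are assumptions for illustration, not the model the authors simulate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    multi_stream: bool  # can it serve many concurrent streams?
    latency: int        # ordinal rank: 0 = low, 1 = medium, 2 = high
    cost: int           # ordinal rank: 0 = cheap, 1 = medium, 2 = super expensive


# Ranks follow the slide's qualitative labels; the numbers themselves are assumptions.
TIERS = (
    Tier("tape", multi_stream=False, latency=2, cost=0),
    Tier("spinning disk", multi_stream=True, latency=1, cost=1),
    Tier("SSD", multi_stream=True, latency=0, cost=2),
)


def cheapest_tier(needs_multi_stream, max_latency):
    """Return the cheapest tier that satisfies the workload's requirements."""
    candidates = [
        t for t in TIERS
        if (t.multi_stream or not needs_multi_stream) and t.latency <= max_latency
    ]
    return min(candidates, key=lambda t: t.cost)


# Analysis: many streams, very low latency. WAN transfer: latency does not matter.
analysis = cheapest_tier(needs_multi_stream=True, max_latency=0)
wan = cheapest_tier(needs_multi_stream=True, max_latency=2)
print(analysis.name, "|", wan.name)
```

Under these assumed ranks, analysis traffic ends up on SSD and WAN transfers on spinning disk, while a single-stream workload that tolerates high latency would fall through to tape.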

The Three Tier Model
(Diagram labels: SSD; Spinning Disks; Tape; SRM/gridFTP/WAN; pNFS, Random Access, Analysis; SRM/gridFTP/http, WAN/streaming; Precious; Cached Copy; Precious or Cached Copy.)
• Will start with simulations based on log files.
• First results will be published at ISGC (Taipei) and CHEP'12 by Dmitry Ozerov et al.

More cool stuff
dCache will come with its own WebDAV browser client. Stay tuned.

Some conclusions
• EMI (DATA) is already significantly contributing to the HEP data grid …
• Sustainability is now being worked on.
• Industry standards are becoming available within EMI-Data.
• EMI builds the framework of collaboration even among natural competitors (dCache, StoRM and DPM). Customers benefit.
• Go and try out the EMI repository!!!
• More info on EMI Data with all details and timelines: https://twiki.cern.ch/twiki/bin/view/EMI/EmiJra1T3DataDJRA122

Enjoy
EMI is partially funded by the European Commission under Grant Agreement INFSO-RI-261611