Worldwide Data Processing with SAMGrid

As experiments refine their understanding of raw data, a point is reached where it becomes desirable to reanalyze the entire dataset with the latest techniques. For the D0 experiment, the datasets involved are large: ~250 TB, equivalent to a stack of CDs nearly as tall as the Eiffel Tower.

Processing such large datasets in a timely manner requires large-scale compute resources. A single pass over the full dataset will involve:

  • Reading ~250 TB of input
  • Writing ~70 TB of output
  • Processing ~1 billion events

To complete such a pass within 6 months requires ~3.5 THz of PIII-equivalent compute capacity.
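As a rough cross-check of these figures, the sketch below derives the sustained data rates, the average raw event size, and the per-event CPU cost they imply. It is a back-of-the-envelope estimate and not part of SAMGrid: the function name `pass_budget` is hypothetical, and the calculation assumes decimal terabytes, a 6-month window of 182.5 days, and perfectly efficient use of the full capacity for the whole period.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_TB = 1e12  # assumption: decimal terabytes


def pass_budget(input_tb=250.0, output_tb=70.0, events=1e9,
                capacity_thz=3.5, days=182.5):
    """Derive average rates implied by one full reprocessing pass.

    Defaults are the approximate figures quoted in the text; the
    182.5-day window and 100% utilization are assumptions.
    """
    seconds = days * SECONDS_PER_DAY
    return {
        # Sustained WAN/storage rates averaged over the whole window.
        "input_mb_per_s": input_tb * BYTES_PER_TB / seconds / 1e6,
        "output_mb_per_s": output_tb * BYTES_PER_TB / seconds / 1e6,
        # Average size of one raw event.
        "raw_kb_per_event": input_tb * BYTES_PER_TB / events / 1e3,
        # CPU cost per event in GHz-seconds of PIII-equivalent capacity.
        "ghz_seconds_per_event": capacity_thz * 1e3 * seconds / events,
    }


if __name__ == "__main__":
    for name, value in pass_budget().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions the pass averages roughly 16 MB/s of input and 4.4 MB/s of output, about 250 kB per raw event, and about 55 GHz-seconds of PIII-equivalent CPU per event. Real utilization is never perfect, so these are averages implied by the quoted totals, not measured values.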

SAMGrid provides an ideal platform for mustering the large-scale resources needed to do the D0 data reprocessing, with over 20 production sites located across North America, Europe, Asia and South America.

More than a dozen sites worldwide were able to participate in the D0 reprocessing effort by providing a peak compute capacity of over 3.5 THz in PIII-equivalent units:

  • CCIN2P3 (Lyon)
  • Fermilab
  • GridKa (Karlsruhe)
  • Manchester
  • SPRACE (São Paulo)
  • WestGrid (Vancouver, BC)
  • CMS (at FNAL)
  • FZU (Prague)
  • Imperial (London)
  • OSCER (Oklahoma)
  • D0SAR (Texas, Arlington)
  • Wisconsin

Essential services provided by SAMGrid:

  • Complete meta-computing environment, including Grid-level job management based on Condor and Globus
  • Delivery of executables to sites in an encapsulated compute environment suitable for operation on diverse Linux installations
  • Delivery of raw data over WAN to remote installations
  • Transport of output back to FNAL and storage in MSS
  • Bookkeeping of processing history, job success/failure, and job recovery
  • Monitoring facilities for job status, site availability, and error logging

  • Submission screen dump…
  • Site screen dump…
  • Job screen dump…
  • Data flow (FNAL -> remote -> merge -> FNAL)
  • Conclusion: time spent, data processed, etc.