ILD Computing needs
Akiya Miyamoto
ILD Ichinoseki Meeting, 21 February 2018

Introduction
  • ILD LOI and DBD did not describe the computing cost, because
      ◦ Hardware performance would improve
      ◦ Software evolution would require more resources
  • TDR
      ◦ A building for the data analysis was included, based on the estimation in the LOI era, with the FNAL LHC facility as a basis.
      ◦ Computing hardware itself, human resources for operation, and network were not included
  • After DBD, the LCC “Yamada” committee made a request
      ◦ ILD study: presented at ILD Workshop 2014. H20 scenario, DBD experience
      ◦ LCC Software and Computing WG report: http://www.linearcollider.org/P-D/Working-groups
  • Last fall, a request by LCC and KEK
      ◦ Preparation for a query about cost and infrastructure required.
      ◦ 250 GeV staging in mind.

Computing concept
[Diagram labels: ILC Lab., Detector, IP Campus, Main Campus, GRID, World]
  • Role of each computing facility:
      ◦ IP Campus: event building, fast data monitor
      ◦ Main Campus: data storage, event (BX) selection, quick data analysis
      ◦ GRID computing: secondary data analysis, user analysis, simulation
  • Computing at IP Campus: DAQ of the experimental group
  • Role of the computing in ILC Lab. Main Campus
      ◦ Triggerless readout. Remove background data at an early stage of analysis
      ◦ Shared among ILD and SiD. Lab.-wide uniform support of mail, security, …
      ◦ Follows the past tradition: basic resources are supported by the lab. as a part of the running cost.

Basis of estimation: ILD raw data size in TDR (@ 500 GeV)
Raw data size per train, estimated @ 500 GeV (2014):
  • VXD: ~100 MB
  • BeamCal: 126 MB, reduced to 5% = 6 MB
  • Others: < 40 MB
  • Dominated by low-E e+/e- background due to beamstrahlung
130
Total data size:
  • ~180 MB/train, ~0.9 GB/sec, ~7.1 PB/year (0.8 x 10^7 sec)
  • ~277 MB/train, ~1.4 GB/sec, ~11.1 PB/year
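The per-second and per-year figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes 5 bunch trains per second (the slide does not state the rate, but it is consistent with 180 MB/train giving ~0.9 GB/sec) and decimal units (1 GB = 1000 MB, 1 PB = 10^6 GB). The function name `raw_data_rate` is made up for this illustration.

```python
# Assumption: 5 bunch trains per second (not stated on the slide).
TRAINS_PER_SEC = 5
# From the slide: 0.8 x 10^7 seconds of data taking per year.
SECONDS_PER_YEAR = 0.8e7


def raw_data_rate(mb_per_train):
    """Return (GB/sec, PB/year) for a given raw data size per train in MB."""
    gb_per_sec = mb_per_train * TRAINS_PER_SEC / 1000.0
    pb_per_year = gb_per_sec * SECONDS_PER_YEAR / 1.0e6
    return gb_per_sec, pb_per_year


if __name__ == "__main__":
    for size_mb in (180, 277):
        gb, pb = raw_data_rate(size_mb)
        print(f"{size_mb} MB/train: {gb:.2f} GB/sec, {pb:.1f} PB/year")
```

With these assumptions the results agree with the quoted values to within rounding of the per-train sizes.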

Storage estimation (ILD)
  • Running scenario: 1st stage 250 GeV. Total int. lumi H20
  • Raw data size:
      ◦ TDR 500 GeV with AHCAL corr.: ~11 PB/year
      ◦ 250 GeV nominal: same as 500 GeV (bkg would be similar)
      ◦ Run with x2 luminosity: x2 of nominal
  • 2 raw data sets: one set at Lab, another set somewhere in the world.
  • Filtered/analyzed data.
      ◦ A fraction of signal BX (DBD signal samples + ): ~1%. Assume 3% of BXs remain after filtering.
      ◦ Event size per BX would be x2 after filtering and initial analysis (REC/SIM ratio of DBD samples)
      ◦ After reanalysis on GRID, event size would be x3 of raw event.
      ◦ DST files would be replicated to 10 sites worldwide.
  • Simulation data
      ◦ Produce x10 the luminosity of real data on GRID
      ◦ Event data size: adopt DBD data size.
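One possible reading of these scaling factors, as a rough sketch: take the raw volume per year, keep two copies, and scale the 3% of surviving BXs by the quoted size factors. Treating filtered volume as raw volume x kept fraction x size factor is an assumption of this sketch, not something the slide spells out. DST and simulation volumes are left out because their event sizes are not given here. The function and key names are illustrative only.

```python
def storage_per_year(raw_pb=11.0, raw_copies=2, kept_fraction=0.03,
                     filtered_factor=2.0, reanalysed_factor=3.0,
                     lumi_factor=1.0):
    """Rough yearly storage (PB) under the slide's stated assumptions.

    raw_pb            : raw data per year at nominal luminosity (slide: ~11 PB)
    raw_copies        : number of raw data sets kept (slide: 2)
    kept_fraction     : fraction of BXs remaining after filtering (slide: 3%)
    filtered_factor   : event size growth after filtering/initial analysis (x2)
    reanalysed_factor : event size growth after GRID reanalysis (x3 of raw)
    lumi_factor       : 1.0 for nominal, 2.0 for the x2 luminosity run
    """
    raw = raw_pb * lumi_factor
    return {
        "raw_all_copies": raw * raw_copies,
        "filtered": raw * kept_fraction * filtered_factor,
        "reanalysed": raw * kept_fraction * reanalysed_factor,
    }


if __name__ == "__main__":
    for label, lumi in (("nominal", 1.0), ("x2 luminosity", 2.0)):
        print(label, storage_per_year(lumi_factor=lumi))
```

Under this reading the raw copies dominate: the filtered and reanalysed samples together are a few percent of the raw volume.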

CPU needs
  • MC simulation (on GRID)
      ◦ x10 real data statistics
      ◦ CPU time: DBD signal + Bhabha etc. + reconstruction. Assume Bhabha etc. = DBD signal; reconstruction = 0.5 x DBD signal sim.
  • Real data processing:
      ◦ Data filtering: all BXs, same CPU time as data reconstruction. Major part of CPU demands
      ◦ Reconstruction: filtered events (3% of all BXs). Same CPU time as sim.
      ◦ CPU capacity enough to analyze 1 year of data in 240 days.
      ◦ Another reconstruction after re-calibration, on GRID
  • User analysis and detector calibration are not counted.
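The "1 year of data in 240 days" requirement turns a total CPU time into a number of cores. A minimal sketch, assuming every BX is filtered once and only the kept fraction is fully reconstructed; the per-BX CPU times and the BX count are free parameters here because the slide gives no absolute values, and the function name is hypothetical.

```python
def cores_needed(n_bx, t_filter_sec, t_reco_sec, kept_fraction=0.03, days=240):
    """Cores required to process one year of data within `days` days.

    n_bx          : number of bunch crossings recorded in one year
    t_filter_sec  : CPU seconds to filter one BX (applied to all BXs)
    t_reco_sec    : CPU seconds to reconstruct one kept BX
    kept_fraction : fraction of BXs surviving the filter (slide: 3%)
    """
    cpu_seconds = n_bx * t_filter_sec + n_bx * kept_fraction * t_reco_sec
    return cpu_seconds / (days * 86400.0)


if __name__ == "__main__":
    # Purely illustrative inputs, not values from the talk.
    print(cores_needed(n_bx=1.0e10, t_filter_sec=1.0, t_reco_sec=1.0))
```

If the per-BX filtering and reconstruction times are comparable, filtering costs roughly 1/0.03 ≈ 33 times more than reconstruction, which matches the slide's remark that filtering is the major part of the CPU demand.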

Year by year evolution
  • Assume SiD = ILD
  • Luminosity ramp-up scenario is included.
  • Only the on-site estimation is shown below.
[Plot axes: Annual Integrated Luminosity (fb-1), Annual Integrated Luminosity (ab-1); running stages: 250 GeV, 250 GeV 2x Lumi, 500 GeV, 350 GeV, 500 GeV 2x Lumi]

Computing cost in Lab
  • Assumption
      ◦ Rental. 4 year service + 0.5 year for replacement
      ◦ 10% of year 0 system, before year 0
  • Additional cost to be added to the TDR value would be
      ◦ Hardware to support ILD and SiD needs at Lab.
          ▪ Includes CPU, tape robot, disk, software, UPS, cooling. Tape media not included.
          ▪ Based on KEKCC 2017, assume cost reduction of CPU (2%/y), storage (10%/y)
      ◦ Network and human resources for operation of scientific system
      ◦ Support by running cost
  • A building space for the computing system ~ a space in TDR.
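The assumed yearly cost reductions can be turned into a unit-cost factor relative to the KEKCC 2017 baseline. The sketch below assumes the reduction compounds year on year; the slide gives only the rates (2%/y for CPU, 10%/y for storage), so the compounding model and the function name are assumptions.

```python
CPU_REDUCTION_PER_YEAR = 0.02      # from the slide: CPU 2%/y
STORAGE_REDUCTION_PER_YEAR = 0.10  # from the slide: storage 10%/y


def unit_cost_factor(years_after_baseline, annual_reduction):
    """Unit cost relative to the baseline year, assuming compounding."""
    return (1.0 - annual_reduction) ** years_after_baseline


if __name__ == "__main__":
    for years in (0, 5, 10):
        cpu = unit_cost_factor(years, CPU_REDUCTION_PER_YEAR)
        storage = unit_cost_factor(years, STORAGE_REDUCTION_PER_YEAR)
        print(f"+{years} y: CPU x{cpu:.3f}, storage x{storage:.3f}")
```

Because the two rates differ by a factor of five, storage unit cost falls much faster than CPU unit cost over a decade under this model.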

Summary
  • Computing resources for ILD data analysis were revised.
  • The estimated resources at the lab were used to estimate a building space and an operational cost for the lab.
  • The estimation is based on many assumptions.
      - Raw data size with the latest beam parameters?
      - Efficiency to remove backgrounds?
      - CPU time for background removal?
    Timely update of these estimates is desirable.

BACKUP

BACKUP

A model of ILD data processing
ILD F.E.
Online Computer @ control room
  1. Build train data
  2. Send data to Main Computer and monitor processes

  1. Data sample and reconstruction for monitoring
  2. Temporary data storage for emergency
Main Computer @ main campus (~1 GB/sec)
Temp. Storage
DAQ/Online | Offline

DAQ/Online | Offline
Main Computer @ main campus (~1 GB/sec)
Online reconstruction chain
  1. Write data
  2. Sub-detector based preliminary reconstruction
  3. Identify bunches of interest
  4. Calibration and alignment
  5. Background hit rejection
  6. Full event reconstruction
  7. Event classification
Data products
  • Online Processed Data (OPD)
  • Raw Data (RD)
  • Fast Physics Data (FPD)
  • Calibration Data (CD)
GRID based offline computing
  • JOB-A: Re-processing with better constants; Offline Reconstructed Data (ORD)
  • JOB-B: Produce condensed data sample; DST
  • JOB-C: MC production; MC data
  • Raw Data Copy