Analysis Trains
Costin Grigoras, Jan Fiete Grosse-Oetringhaus
ALICE Offline Week, 04.10.12
Analysis Trains - Jan Fiete Grosse-Oetringhaus

LEGO Trains
• 42 trains configured (37 active)
  – 5 CF, 4 GA, 1 PP, 8 JE, 5 DQ, 11 HF, 8 LF
• Submitted trains this year
  – 213 CF, 35 DQ, 24 GA, 124 HF, 173 JE, 114 LF, 3 PP
• 1-5 train operators per train
• Operator mailing list: alice-analysis-train-operators@cern.ch
• TWiki page: https://twiki.cern.ch/twiki/bin/viewauth/ALICE/AnalysisTrains

Usage since 01.02.12:

PWG   Jobs [k]   Wall in years
CF    2007       316.8
DQ    1129       316.1
GA     852       140.1
HF    3354       700.4
JE    1071       362.4
LF    1312       137.5
PP      11         0.2

On average 2400 jobs at any given time.

Running Statistics
[Plot; legend: alidaq, aliprod, alitrain, SUM]

Time until trains finish
• Time between train submission and submission of the final merging job
• Average below 2 days (good!), but with quite some spread
[Plots: per train; average per month]
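The metric on this slide can be sketched as follows. This is only an illustration, not the code behind the plots: the records, train names and timestamps are invented, and grouping by the month of train submission is an assumption.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical records: (train run, train submission, final merging job submission).
RUNS = [
    ("CF_1", datetime(2012, 8, 1, 9, 0), datetime(2012, 8, 2, 9, 0)),
    ("HF_7", datetime(2012, 8, 10, 12, 0), datetime(2012, 8, 13, 12, 0)),
    ("JE_3", datetime(2012, 9, 5, 8, 0), datetime(2012, 9, 6, 20, 0)),
]


def days_until_final_merge(submitted, final_merge_submitted):
    """Time between train submission and final merging job submission, in days."""
    return (final_merge_submitted - submitted).total_seconds() / 86400.0


def monthly_average(runs):
    """Average finishing time per (year, month) of train submission."""
    per_month = defaultdict(list)
    for _name, submitted, final_merge in runs:
        key = (submitted.year, submitted.month)
        per_month[key].append(days_until_final_merge(submitted, final_merge))
    return {month: mean(values) for month, values in sorted(per_month.items())}


averages = monthly_average(RUNS)
```

With these invented records, August averages 2.0 days (one run of 1 day, one of 3 days) and September 1.5 days, which also shows how a monthly average near 2 days can hide a wide spread.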

AliEn Upgrade
• The upgrade of parts to v2-20 this Monday had a few side effects
  – General interruption from 10.00 to midnight; during this period Costin & Pablo were continuously working on fixing the situation
  – Jobs (merging jobs in particular) that were submitted during that time failed and needed to be retried later. Mistake: LPM should have been disabled for the upgrade
  – The new status FAILED, which is not considered a final state, led to some delay for merging jobs; fixed today (a parallel failure of CERN EOS makes submission very slow)
  – Bug in SE selection: some jobs go to FAILED; being fixed by Pablo at present
• I propose that planned upgrades are evaluated in particular with respect to the analysis trains, and that a plan is made for how to recover from failures during the upgrade period
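The FAILED issue can be illustrated with a toy model of the merging trigger. This is a sketch under assumptions and not the AliEn or LPM code: the status names DONE and ERROR are placeholders, only FAILED comes from the slide, and the idea that the fix amounts to counting FAILED as a final state is a guess.

```python
# Placeholder set of final statuses before the fix (names are hypothetical).
FINAL_STATES_BEFORE = frozenset({"DONE", "ERROR"})
# Assumed fix: the new FAILED status is also treated as final.
FINAL_STATES_AFTER = FINAL_STATES_BEFORE | {"FAILED"}


def ready_for_merging(job_states, final_states):
    """Merging may be submitted only once every analysis job is in a final state."""
    return all(state in final_states for state in job_states)


jobs = ["DONE", "DONE", "FAILED"]

# Before: the FAILED job never counts as finished, so merging keeps waiting.
waiting_before = not ready_for_merging(jobs, FINAL_STATES_BEFORE)
# After: the same set of jobs releases the merging step.
released_after = ready_for_merging(jobs, FINAL_STATES_AFTER)
```

In this model a single job in an unrecognised status is enough to hold back the merging of the whole train, which matches the delay described above.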

Planned Improvements

Improve Merging
• Merging
  – Dedicated CE/SE for merging (at CERN) being investigated
  – Merging job submission to be sped up (at the moment it depends on the number of waiting analysis jobs)
• Job Splitting
  – Investigate the new AliEn option to select the input files once the job has started; this increases the number of files per job (less merging, more files for event mixing)

Train Statistics
• Add consumed CPU and wall time, in total and per job, to the run view
[Screenshot values: 2.2 y CPU total, 3.2 y wall total, 3.2 h CPU / job, 4.2 h wall / job, 4.7 files / job]
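The quantities listed for the run view could be derived from per-job records roughly as below. This is a sketch only: the JobRecord fields and the sample numbers are invented for illustration and are not taken from the train system.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 365.25 * 24


@dataclass
class JobRecord:
    """Hypothetical per-job accounting record."""
    cpu_hours: float
    wall_hours: float
    n_files: int


def train_statistics(jobs):
    """Totals in years plus per-job averages for one train run."""
    n_jobs = len(jobs)
    if n_jobs == 0:
        raise ValueError("a train run needs at least one job")
    cpu = sum(job.cpu_hours for job in jobs)
    wall = sum(job.wall_hours for job in jobs)
    files = sum(job.n_files for job in jobs)
    return {
        "cpu_total_years": cpu / HOURS_PER_YEAR,
        "wall_total_years": wall / HOURS_PER_YEAR,
        "cpu_hours_per_job": cpu / n_jobs,
        "wall_hours_per_job": wall / n_jobs,
        "files_per_job": files / n_jobs,
    }


stats = train_statistics([JobRecord(3.0, 4.0, 5), JobRecord(1.0, 2.0, 3)])
```

For the two invented jobs this gives 2.0 h CPU per job, 3.0 h wall per job and 4.0 files per job; the ratio of CPU to wall time then gives a rough efficiency figure per train.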

Dataset Selection
• Allow users to indicate on the interface which datasets they would like to run on
  – Operator marks dataset as "active" (similar to wagons)
  – User selects the desired datasets among those
[Screenshot: desired datasets, e.g. LHC10h_AOD086, LHC11h_AOD095, …]
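The proposed two-step selection can be sketched as a small filter. The function names and the inactive entry are hypothetical; only the two LHC dataset names appear on the slide, and the real interface may work differently.

```python
# Operator view: dataset name -> marked "active" by the operator.
# The inactive entry is a made-up placeholder.
DATASETS = {
    "LHC10h_AOD086": True,
    "LHC11h_AOD095": True,
    "EXAMPLE_INACTIVE_DATASET": False,
}


def selectable_datasets(datasets):
    """Datasets offered to users: only those the operator marked active."""
    return [name for name, active in datasets.items() if active]


def select_datasets(datasets, requested):
    """Accept a user's wish list only if every entry is an active dataset."""
    allowed = set(selectable_datasets(datasets))
    rejected = [name for name in requested if name not in allowed]
    if rejected:
        raise ValueError(f"not active: {rejected}")
    return list(requested)


offered = selectable_datasets(DATASETS)
chosen = select_datasets(DATASETS, ["LHC11h_AOD095"])
```

Keeping the "active" flag with the operator mirrors how wagons are handled: users express a wish, and the operator decides what a train actually runs on.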

Merging Test
• Test also the merging per wagon
[Screenshot: merging test, OK / Failed]

Further Ideas
• Number of wagons
• Enabling/disabling by lists (of wagon numbers / names?)
• Saving / loading of train configurations
• Groups of wagons
• Ordering of wagons

Demo
…some new features…

Summary
• The LEGO train system has become very popular
• The average finishing time of a train is 2 days, but with quite some spread
• We have lots of improvement requests and ideas
• We lack manpower (there are only Costin and me, both with many other tasks, too), which sometimes leads to long response times