ATLAS software workshop at BNL, 23-28 May 2004
Summary of notes taken during talks, plus some slides taken from interesting talks – Ricardo, 29 May 2004
Outline
• Overall notes and plans
• Core software, distribution kit, etc.
• Simulation and detector description
• Reconstruction and event selection
• Analysis tools and Event Data Model
• Combined Test Beam
Overall notes and plans
Many different sessions and working group meetings… I will try to summarize, focusing on some subjects more than others. Also, most of this talk is a transcription of my notes; not all of it makes sense or is very explicit.
• Software plenary sessions I & II
• Software plenary session III
• Event selection, reconstruction and analysis tools
• Software infrastructure
• Simulation and detector description
• Data challenge
• Frameworking group
• Grid and distributed analysis
• Event Data Model (EDM) working group
• SUSY working group
• Reconstruction working group
• Physics coordination
• Physics Event Selection Algorithms (PESA) working group
• Grid working group
• Physics validation
• Detector description working group
• Distributed analysis working group
• Database and calibration/alignment
• Combined performance
• International computing board
• Analysis tools working group
• Database working group
• Software distribution and deployment working group
• Calibration/alignment working group
• Major production activities soon: DC2, Combined Test Beam
• Also: HLT testbed and physics studies leading up to Physics Workshop 2005
• Next software week 20-24 September: after DC2 and at the end of the beam test
• Documentation is only in the minds of a few people – some plans to document parts of the software during the next year
• Several buzzwords in this meeting:
  Ø Geant4
  Ø Pileup / digitization / event mixing
  Ø EDM: ESD/AOD (what information should be in the AOD? 100 kB max)
  Ø Use of Python job options (no more athena.exe in > 8.2.0, use jobOptions_XXX.py)
  Ø Grid
  Ø Calibration
  Ø etc.…
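The move to Python job options means job configuration becomes an executable script rather than a static text file. A minimal sketch of the style, using mocked classes — this is illustrative only, not the real Athena/Gaudi API:

```python
# Illustrative sketch of Python-based job configuration, in the spirit of
# Athena's jobOptions_XXX.py. The classes below are mocks, NOT the real
# Athena/Gaudi API; they only show why scripted configuration is flexible.

class Algorithm:
    def __init__(self, name, **properties):
        self.name = name
        self.properties = dict(properties)

class AlgSequence:
    """Ordered list of algorithms to run per event (mock)."""
    def __init__(self):
        self.algs = []
    def __iadd__(self, alg):
        self.algs.append(alg)
        return self

topSequence = AlgSequence()
topSequence += Algorithm("EventSelector", InputCollections=["dc2.pool.root"])
topSequence += Algorithm("CaloCellMaker", NoiseThresholdMeV=100.0)

# Being Python, configuration can be computed rather than hand-written:
for det in ["Pixel", "SCT", "TRT"]:
    topSequence += Algorithm(det + "Digitization")

print([a.name for a in topSequence.algs])
```

The loop at the end is the kind of thing a plain-text job options file cannot express, which is one motivation for the switch.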
ATLAS Computing Timeline
2003
• POOL/SEAL release (done)
• ATLAS release 7 (with POOL persistency) (done)
• LCG-1 deployment (done)
2004
• ATLAS completes Geant4 validation (done) ← NOW
• ATLAS release 8 (done)
• DC2 Phase 1: simulation production
2005
• DC2 Phase 2: intensive reconstruction (the real challenge!)
• Combined test beams (barrel wedge)
• Computing Model paper
2006
• Computing Memorandum of Understanding
• ATLAS Computing TDR and LCG TDR
• DC3: produce data for PRR and test LCG-n
2007
• Physics Readiness Report
• Start commissioning run
• GO!
Planning (T. LeCompte)
• For the first time, the delay in the schedule is consistent with zero
• Several project reviews expected soon
• Inner Detector and Core software seem to need lots of work and people
Core software, distribution, validation, etc
Core software (C. Leggett)
Release plans:
• 8.2.0 – 22 May
• 8.3.0 – 9 June
• 9.0.0 – 30 June – DC2 production
• Gaudi:
  – Current version is v14r5 + ATLAS-specific patches (ATLAS version 0.14.6.1)
  – Changes for v14:
    · Uses SEAL, POOL, PI (?)
    · AIDA histogram Svc replaced with ROOT
    · GaudiPython
    · Event merging: can now control exactly which events from 2 files are merged
    · Pileup: see Davide's talk Tuesday
    · Interval of Validity (IoV) Svc improved
  – v8.2.0: new version of CLHEP causes the wrong version of HepMC to load when using athena.exe (not athena.py) -> segfault
• Athena and Gaudi: heading towards rel. 9, Gaudi v15 (trying to stay in step with LHCb); extend installArea to Gaudi
Distribution kit
• Development platform vs. deployment platform
• Kit philosophy:
  – Address multiple clients: running binaries/doing development
  – Possibility of downloading partial releases, binaries only (Gregory working on this one), source code…
  – Multiple platforms (not yet)
• Until recently only RedHat 7.3 + gcc 3.2 supported – in future, CERN Enterprise Linux (CEL) 3 + gcc 3.2.3 / icc 8 (64-bit compiler) / Windows / MacOSX
• Complication in development platforms from external software – need to clean up packages and think about dependencies
• Runtime environment has some problems – not easy to set up for partial distributions
• Pacman/kit tutorial to be organized soon
Validation (D. Costanzo)
• 3 activities:
  – reconstruction of DC events with old ZEBRA events
  – validation of generators for DC2
  – reconstruction of G4 events for DC2
• Plan for next year: generate more events and prepare for Physics Workshop 2005
• v8.2.0 should be the next usable release (Moore/MuID?), exercise reconstruction
• Various details:
  – Some degradation of reconstruction was found wrt 7.0.2
  – Atlfast needed to validate generated events
  – Several problems found, e.g. xKalman in 8.0.3
  – A lot of activity in electron reconstruction
  – Jet/ETmiss: JetRec under restructuring, H1-style weights need to be recalculated for Geant4
  – DC2 going into production mode
Grid infrastructure (Luc Goossens et al.)
• Production system architecture:
  – Executors: one for each facility flavour – LCG (lexor), NG (dulcinea), Grid3 (Capone), PBS, LSF, BQS, Condor?, …
    · translates the facility-neutral job definition into a facility-specific language – XRSL, JDL, wrapper scripts, …
    · implements a facility-neutral interface – usual methods: submit, getStatus, kill, …
  – Data management system:
    · allows global cataloguing of files – we have opted to interface to existing replica catalog flavours
    · allows global file movement – an ATLAS job can get/put a file anywhere
    · presents a uniform interface on top of all the facility-native data management tools
    · implementation -> Don Quijote – see separate talk
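The executor idea — one per facility flavour, each translating the same facility-neutral job definition into a facility-specific submission language — can be sketched as follows. Class and method names mirror the talk (submit, getStatus, kill), but the implementations are mocked for illustration, not the real Windmill/production-system code:

```python
# Illustrative sketch of the facility-neutral executor interface described
# above: one subclass per facility flavour, each translating a neutral job
# definition into its facility's language (JDL for LCG, XRSL for NorduGrid).

class Executor:
    """Facility-neutral interface; one subclass per facility flavour."""
    def submit(self, job): raise NotImplementedError
    def getStatus(self, job_id): raise NotImplementedError
    def kill(self, job_id): raise NotImplementedError

class LCGExecutor(Executor):
    """Translates the neutral job definition into JDL for LCG."""
    def submit(self, job):
        jdl = 'Executable = "%s"; Arguments = "%s";' % (
            job["executable"], " ".join(job["args"]))
        return jdl  # in reality: hand the JDL to the LCG workload manager

class NGExecutor(Executor):
    """Translates the neutral job definition into XRSL for NorduGrid."""
    def submit(self, job):
        xrsl = "&(executable=%s)(arguments=%s)" % (
            job["executable"], " ".join(job["args"]))
        return xrsl

# The same neutral job definition works for every facility flavour:
job = {"executable": "athena.exe", "args": ["jobOptions_DC2.py"]}
print(LCGExecutor().submit(job))
print(NGExecutor().submit(job))
```

The point of the design is that the supervisor never needs to know which facility a job lands on; only the executor does.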
Data Management (database): Don Quijote
• Supervisor: Windmill
• Executors: NG (dulcinea), LCG (lexor), Grid3 (Capone), batch (LSF)
[Diagram: the production DB feeds the Windmill supervisors, which talk to their executors via Jabber; each executor (LCG, NG, Grid3, LSF) submits to its grid or batch facility, with Don Quijote providing data management on top of the RLS replica catalogs.]
DC2 running (S. Albrand)
• DC2 will have a much more sophisticated production system than DC1 and uses POOL instead of ZEBRA
• Only one database for all data types; the data type is stored in the database
• Users add properties to their datasets (collections of logical files) from a predefined list of possible properties
• This allows database searches for datasets
• Dataset names proposed to the user (modifications possible)
• User interface to submit tasks
• Datasets are no longer owned by one physics group alone; notion of a principal physics group
• Possibility to extract parent and child datasets: the parent may be a generator-level dataset and the children may be reconstructed datasets
Simulation and Detector Description
Detector description (J. Boudreau)
• The detector description in "GeoModel" will be used in both simulation and reconstruction
• LAr is the last detector missing in GeoModel
• TileCal has finished its GeoModel description but it will not be available for DC2
• Volumes know their subvolumes: calculate dead material etc. in a transparent way
• A relational database (Oracle) will contain versions of the whole detector: no hardwired detector description numbers in reconstruction code
• Changing things such as alignment will be dealt with through different versions of the detector description in the database
• Great live demo of GeoModel!
Simulation (A. Rimoldi)
• Generators in an advanced stage in both Atlfast and Atlsim: added "Cascade" and "Jimmy"; some differences between G3 and G4 Herwig events
• G4 ATLAS being debugged: use versions > 6.0; new Geant4 release 6.1 (25/3/2004)
• Digitization: most infrastructure software now in place, but work to do for each subdetector
• Pileup: under development and test; different subdetectors can be affected by different bunch crossings
• Combined test beam: general infrastructure is ready
G4 ATLAS status (A. Dell'Acqua)
• Concentrating on DC2 production
• G4 v6.1 gave some problems, went back to v6.0
• 6.0 has some physics problems (which don't seem to be serious, from his tone), but aiming for robustness at the cost of some accuracy
• MC truth: fully deployed with rel. 8.0.2, full support in 8.0.3
• Need guinea pigs to test MC truth and for simulation and reconstruction
• G4 truth uses the same HepMC format as the generated event
• Migration to Python job options
Digitization and Pileup (D. Costanzo)
• Sub-detector breakdown: ID getting there, calorimetry very mature since DC1, muons not so
• Pileup: each detector reads pileup from different bunch crossings
• Number of files in POOL should be kept low: many files lead to memory leaks!
• Disk usage explodes with the use of pileup: most of this is MC truth for the background events, which doesn't have to be written and will be eliminated
• Memory leak problems
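The pileup scheme described above — each sub-detector reading hits from a different window of bunch crossings around the triggered one — can be sketched as follows. This is a simplified illustration: the window sizes and names are invented for the example, not taken from the Athena digitization code:

```python
import itertools

# Each sub-detector is sensitive to a different window of bunch crossings
# around the triggered one (bc = 0). Windows below are INVENTED for
# illustration; real sensitivity windows are detector-specific.
SENSITIVITY_WINDOW = {
    "Pixel": (-1, 1),   # fast detector: few crossings matter
    "LAr":   (-5, 25),  # slow signal shaping: long window
    "MDT":   (-10, 10),
}

def mix_pileup(signal_hits, minbias_pool, detector, mu=2):
    """Overlay minimum-bias hits over the detector's bunch-crossing window.

    mu: pileup interactions per crossing (fixed here; Poisson in reality).
    minbias_pool: list of minimum-bias events (each a list of hits).
    """
    lo, hi = SENSITIVITY_WINDOW[detector]
    pool = itertools.cycle(minbias_pool)
    merged = list(signal_hits)
    for _bc in range(lo, hi + 1):
        for _ in range(mu):
            merged.extend(next(pool))
    return merged

signal = ["sig_hit"]
minbias = [["mb_hit"]]
pixel = mix_pileup(signal, minbias, "Pixel")  # 3 crossings x 2 interactions
lar = mix_pileup(signal, minbias, "LAr")      # 31 crossings x 2 interactions
print(len(pixel), len(lar))  # -> 7 63
```

The asymmetry between the detectors also illustrates why pileup disk usage explodes: a slow detector like the LAr calorimeter accumulates background from dozens of crossings per triggered event.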
Detector Simulation Conclusions
• G4 ATLAS (A. Rimoldi):
  – Work on digitization and pileup
  – Full deployment of MC truth
  – Migration to Python job options
  – Major changes in AtlasG4Sim to converge with ATLAS
  – Human resources problem (Armin)
• Digitization and Pileup (D. Costanzo):
  – Detector folders: each detector reads hits from different bunch crossings
  – Emphasis moved to detector description in GeoModel
  – Inner Detector: noise was implemented with no hits from pixel/SCT
  – LAr calorimeter: improvements wrt G3; use of GeoModel; still bugs and calibration issues in G3/G4
  – Muon spectrometer: migration to GeoModel done in the last few days; effort put into validation of hit positions; detailed simulation of MDT digitization and response
  – Realistic pileup procedure still needs work
  – Setup for combined test beam in place (or almost)
Analysis Tools
Analysis tools (K. Assamagan)
• RTF recommendations: looked at modularity, granularity & design of reconstruction software
• AnalysisTools: span the gap between reconstruction and ntuple analysis
• Tools:
  – Artemis analysis framework prototype: seems to diverge from the EDM, may not be supported for long
  – PID: prototype to handle particle identification
  – PyROOT, PyLCGDict
  – Physicists Interfaces (PI) project: extends AIDA; provides services for batch analysis in C++, fitting and minimization, storage in HBOOK/ROOT/XML, plotting in ROOT/HippoDraw/OpenScientist, etc.
• Workshop in April at UCL: http://www.usatlas.bnl.gov/PAT/ucl_workshop_sumary.pdf
[Diagram: data flow between data objects and algorithms]
Python scripting language (W. Lavrijsen)
• GaudiPython: provides bindings to Athena core objects; basis for job options after release 8.2.0
• PyROOT: bridge between Python and ROOT; distributed with ROOT 4; http://root.cern.ch/root/HowtoPyROOT.html
• PyBus: software bus (modules can be "plugged in" to the bus) implemented in Python; http://cern.ch/wlav/pybus
• ASK (Athena Startup Kit): DC2-inspired tutorial online; full chain of algorithms (generators => … => simulation => … => analysis); http://cern.ch/wlav/athena/athask/tutorials.html
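The "software bus" idea behind PyBus — components are loaded by name and plugged into a central bus at runtime, so tools stay decoupled from one another — can be sketched in plain Python. This is a mock for illustration only, not the actual PyBus API:

```python
# Minimal mock of a "software bus": components register under a name and
# clients retrieve them through the bus, never importing each other
# directly. Illustration only -- not the real PyBus interface.

class SoftwareBus:
    def __init__(self):
        self._components = {}

    def plug(self, name, component):
        """Plug a component into the bus under a symbolic name."""
        self._components[name] = component

    def get(self, name):
        """Look a component up by name; clients stay decoupled."""
        return self._components[name]

bus = SoftwareBus()
bus.plug("histogramming", {"backend": "ROOT"})
bus.plug("fitting", {"backend": "Minuit"})

print(sorted(bus._components))
print(bus.get("histogramming")["backend"])
```

Swapping the histogramming backend then only requires plugging a different component under the same name, which is the decoupling the bus model is after.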
Analysis with PyROOT (S. Snyder)
• Why? Because ROOT C++ is not reliable enough; Python is much better and as good, if not better
• No speed problems found
• PyROOT is now part of the ROOT release
• Can use PyROOT to interface your own code (or any code that CINT could handle)
• PyLCGDict does the same thing with a different data dictionary (data definition). When to use which?
  – If you have external code that already has a ROOT/LCG dictionary, that helps to decide
  – PyROOT has fewer dependencies
Reconstruction, Trigger and Event Data Model
Reconstruction (D. Rousseau)
• RTF recommendation: detectors (e.g. TileCal and LAr) should share code to facilitate downstream algorithms
• Combined test beam: analyse CTB data with offline code, with as little dedicated code as possible; big effort e.g. to integrate the conditions database for the various detectors
• DC2 reconstruction (release 9.x.x): run on Geant4 data with the new detector description; validate G4
• Persistence:
  – ESD (event summary data), EDM output: issues with ~200k calorimeter cells; target size 100 kB/ev
  – Ongoing discussions on AOD (analysis object data) definition: aim for 10 kB/ev
Reconstruction (D. Rousseau)
• Work model:
  – Was: Reconstruction -> Combined ntuple -> ROOT/PAW analysis
  – Changes to: Reconstruction -> ESD/AOD -> analysis in Athena -> small ntuple -> ROOT
  – CBNT remains as a debugging tool but will not be produced on a large scale for DC2
• Status:
  – Python job options (no more jobOptions_xxx.txt)
  – People needed for transverse tasks: documentation, offline/CBNT reconstruction integration, AOD/ESD definition
Calorimeter reconstruction (P. Loch)
• CALO EDM has navigable classes CaloCell, CaloTower and CaloCluster using consistent 4-momentum representations (INavigable4Momentum, v8.1.0); can now be used directly by JetRec
• Container class CaloCellContainer holds both LAr and Tile CaloCells and persistifies them in StoreGate (8.2.0, key "AllCalo"); used by LArCellRec, LArClusterRec, TileRecAlgs explicitly
• Clusters produced by CaloClusterMaker (Sven Menke, v8.1.0, topological and sliding-window clusters) have a full 3D-neighbours option, crossing boundaries between calorimeters (8.2.0)
• Cluster splitter with 3D clusters spanning different calorimeters under test (aim for 8.3.0) – finds individual showers (peaks) in a large connected cluster
Calorimeter reconstruction (P. Loch)
• New structure for algorithm class CaloTowerAlgorithm
  – calls different builders for towers according to calorimeter
  – makes the older FCAL minicells obsolete
• CALO algorithm structure slightly behind EDM: new CaloCellMaker (David Rousseau) to be tested
  – makes cells and also calls cell-type corrections (>10 corrections for LAr)
  – aim for 8.3.0, needed in 9.0.0
• No hardwired numbers in the code anymore; detector description/job options (database? job options?)
• Implement relations between cells and clusters for 9.0.0 (using STL maps?)
  – Proposal to have classes such as "particleWithTrack" to implement relations was rejected
• Asked for volunteers for design, implementation and testing of both the calorimeter EDM and the reconstruction algorithms
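The idea of keeping cell-cluster relations in maps can be sketched with the Python analogue of the STL-map proposal: a dict keyed by cell identifier. The identifiers and structure below are invented for illustration; the actual proposal is C++ `std::map` inside the calorimeter EDM:

```python
# Sketch of cell -> cluster relations kept in a map, the Python analogue
# of the STL-map idea mentioned above. Identifiers are invented.

clusters = {
    "cluster_0": ["cell_001", "cell_002", "cell_003"],
    "cluster_1": ["cell_003", "cell_004"],  # a cell shared by two clusters
}

# Invert the cluster -> cells lists into a cell -> [clusters] map, so the
# question "which clusters use this cell?" is a single lookup:
cell_to_clusters = {}
for clus, cells in clusters.items():
    for cell in cells:
        cell_to_clusters.setdefault(cell, []).append(clus)

print(sorted(cell_to_clusters["cell_003"]))
```

A map-based relation like this lives outside the cell and cluster objects themselves, which is one reason it was preferred over dedicated relation classes such as "particleWithTrack".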
Tracking (E. Moyse)
Many recent developments:
• new tracking class; converters from old formats
• very good Doxygen documentation available from the ID software homepage
• a lot of reorganization and new packages recently
• track extrapolation for the ID – Dmitri Emeliyanov
• DC2 will be based on iPatRec and xKalman
• there will be a manual for the EDM and utility packages for v9.0.0
Track: Overview
• New interface for Track, shown above.
• TrackStateOnSurface – provides a way of iterating through hits and scatterers on the track. It contains pointers to:
  – RIO_OnTrack
  – TrackParameters
  – FitQualityOnSurface
  – ScatteringAngleOnSurface
• Summary – the "old" summary object is still in Track in 8.2.0; it could not be removed without changing the interface (see later slide)
TrackParticle: Overview
• Why do we need TrackParticle?
• Need a lightweight object for analysis, providing momentum
• Need to transform parameters from the detector frame to the physics frame
• Provides navigation
Muon Reconstruction (S. Goldfarb)
• Packages Moore and Muonboy (became MuonBox)
• Moore: moved to GeoModel, reconstructs G4 data, DC2 development
• Muonboy: G4 reconstruction expected shortly, not using GeoModel, development for testbeam
• Common features: unit migration now validated
• Efficiency in eta is now perfect (features reported in the SUSY full-simulation paper are gone)
• Combined reconstruction: Staco (now ported to Athena, will accept all types of tracks; being prepared for the new EDM)
• MuID: low-pT muon development using TileCal, etc.
• Track task force
Discussions on EDM/ESD/AOD
• Data flow is: Reconstruction -> ESD (100 kB) -> AOD (10 kB) -> User code -> ntuples
• Meeting at UCL in April: document with conclusions at http://www.usatlas.bnl.gov/PAT/ucl_workshop_sumary.pdf
• Discussion:
  – Proposal for a class of "IdentifiedParticle" which could be a lepton, tagged jet, etc.
  – The proposal was rejected: it seemed to need either a very complicated or a redundant implementation to be sufficiently general
  – Discussion on e.g. e/gamma ID:
    · egammaBuilder – high-pT electron ID
    · softeGammaBuilder – better for low-pT/non-isolated electrons, but much overlap
  – Both collections must be kept, but a balance must be found in similar matters due to AOD size restrictions (aim for 10 kB/ev)
  – CaloCells (200,000! Cannot all be kept!): can keep cells above noise plus the sum of all cells (critical for ETmiss)
  – Similar issues for tracking, muons, etc.
• Conclusions:
  – Keep more than one collection of, e.g., electrons; introduce an "event view" to help choose between candidates
  – CaloCellContainer: a technical solution in sight, but worries about cell removal; needs study
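The CaloCell compromise — keep only cells above a noise threshold, plus the total sum of all cells so ETmiss is preserved — can be illustrated with a toy calculation. The cell energies and threshold below are invented for the example:

```python
import math

# Toy illustration of the AOD cell-keeping compromise discussed above:
# store only cells above a noise threshold, plus the summed (Ex, Ey) of
# ALL cells so the missing-ET information survives. Numbers are invented.

cells = [  # (Ex, Ey) per cell, MeV
    (500.0, 0.0), (0.0, -300.0), (20.0, 10.0), (-15.0, 5.0),
]
threshold = 100.0  # MeV, illustrative noise cut

kept = [c for c in cells if math.hypot(*c) > threshold]
total_ex = sum(ex for ex, _ in cells)
total_ey = sum(ey for _, ey in cells)

# ETmiss magnitude from the *full* sum, not from the kept cells alone:
met = math.hypot(total_ex, total_ey)
met_from_kept = math.hypot(sum(ex for ex, _ in kept),
                           sum(ey for _, ey in kept))
print(len(kept), round(met, 1), round(met_from_kept, 1))  # -> 2 579.9 583.1
```

Only 2 of the 4 cells are stored, yet the true missing ET is still recoverable from the stored total; computing it from the kept cells alone gives a biased answer, which is exactly why the sum must be kept alongside the thresholded cells.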
Reference frames (Richard Hawkings)
Need for more than one frame?
• The global frame is defined in ATL-GE-QA-2041
• Easier to do reconstruction if we have several subdetector-specific frames?
• Boosted frames? For example, such that the sum of pT_beam = 0, to correct for the beam tilt (10^-4 rad, but p_beam = 7 TeV)
• Q&A:
  – Markus Elsing: one global frame must be used for reconstruction. Also, the global frame should be determined by the inner detector frame, if possible; otherwise there is a problem when using lookup tables for subdetector positions
  – Various: the beam tilt should be corrected for if possible/necessary, and the size of the effect estimated; may be done in Monte Carlo
• Conclusions:
  – Strive to use only one global frame
  – The beam tilt should be taken care of in the simulation, same as the vertex smearing
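The size of the beam-tilt effect quoted above is easy to check: a tilt of 10^-4 rad on a 7 TeV beam induces a net transverse momentum of about 700 MeV, and a rotation into the frame where the total beam pT vanishes removes it. A small numeric sketch (illustrative only):

```python
import math

p_beam = 7_000_000.0  # MeV (7 TeV)
tilt = 1e-4           # rad, beam tilt quoted in the talk

# Net transverse momentum induced by the tilt:
pt_offset = p_beam * math.sin(tilt)
print(round(pt_offset, 1))  # ~700 MeV, not negligible for ETmiss

def correct_tilt(px, pz, tilt):
    """Rotate a momentum vector by -tilt in the x-z plane, moving to the
    frame where the beam axis coincides with z (so sum pT_beam = 0)."""
    c, s = math.cos(tilt), math.sin(tilt)
    return c * px - s * pz, s * px + c * pz

# The tilted beam itself comes back onto the axis after the rotation:
px, pz = p_beam * math.sin(tilt), p_beam * math.cos(tilt)
px2, pz2 = correct_tilt(px, pz, tilt)
print(abs(px2) < 1e-3)  # residual pT is numerically zero
```

A 700 MeV systematic offset is small compared to jet energies but matters for precision ETmiss, which is why the conclusion was to handle it in the simulation rather than with per-analysis boosted frames.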
PESA (Simon)
• Several technical matters, automatic software testing; move from development to production phase (stable, user-friendly code etc.)
• Discussion on forced acceptance: a fraction of the events must be kept regardless of trigger acceptance -> studies of noise, background, trigger efficiency etc.
• Discussion on how this should be implemented: fine-grained (according to LVL1 flags), global (a global % of bunch crossings), etc.
• Rather technical reports on the status of the e/gamma, LAr and muon slices
Example: e/gamma slice
• 7.0.2 had an efficiency problem in both TrigIdScan and SiTrack, now solved
[Plots: efficiency vs. eta for TrigIdScan and SiTrack, single electrons + pile-up at 2x10^33 and 10^34; mean efficiencies improve from 94% and 91% in 7.0.2 to 95% and 96% in 8.0.0.]
egamma Workshop
Several topics have come up which need more discussion:
• Combined testbeam: how to integrate what we will learn there?
• Reconstruction: what is done in clustering, what in egamma & electron/photon ID?
• Calibration/corrections: what is specific to e and gamma, and how?
• G4/G3: geometry use
• Physics studies: Z→ee etc.
• Validation of releases
Difficult to find time to discuss all of this during the usual software weeks.
Date: Mon-Tue, June 28-29 (tbc). Place: LPNHE Paris (Fred Derue)
Several combined performance studies…
• Discrepancy found between G3 and G4 – see K. Benslama, Electron Study
• Apparently already explained by a difference between G3 and G4 (the "E-field effect" is not simulated in G3) – see G. Unal, LAr calibration
[Plots: energy (MeV) distribution for E = 20 GeV electrons; below, 50 GeV electrons vs. eta, with the transition at eta = 0.8 and the barrel/end-cap crack visible.]
JetRec (P. Loch)
• First look at new jets
• Kt for towers seems to be slower than it used to be: found some long-standing bugs
• Most jets in the forward direction are extremely low in transverse energy -> apply cuts on the jet-finder input (towers, so far)
• Number of jets in calorimeter and MC truth is very comparable if no cut is made on the Et of input cells: Et > 100/200/500 MeV (very low Et cuts!!!)
• As soon as cuts are applied, basically all distributions (multiplicity, jet shapes, radius, etc.) become different from truth (see next slide)
• Verify hadronic calibration
• Contributions of new jet algorithms invited, if needed
• Physics groups should use the jet finders and give feedback
• Preparing extensive documentation for 8.3.0
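The behaviour described above — soft forward towers forming their own jets until an input ET cut removes them — follows directly from the inclusive Kt distance measure. A simplified sketch of a Kt-style finder on towers (illustration only, not the ATLAS JetRec implementation; recombination is a simple ET-weighted average):

```python
import math

# Simplified inclusive Kt-style jet finder on (et, eta, phi) towers:
# d_iB = et_i^2, d_ij = min(et_i, et_j)^2 * dR^2 / R^2; repeatedly take
# the smallest distance, merging pairs or promoting objects to jets.
def kt_jets(towers, R=1.0, et_cut=0.0):
    """towers: list of (et, eta, phi); returns jets as (et, eta, phi)."""
    objs = [t for t in towers if t[0] > et_cut]  # input ET cut
    jets = []
    while objs:
        best = ("beam", 0, objs[0][0] ** 2)
        for i, (eti, etai, phii) in enumerate(objs):
            if eti ** 2 < best[2]:
                best = ("beam", i, eti ** 2)
            for j in range(i + 1, len(objs)):
                etj, etaj, phij = objs[j]
                dphi = abs(phii - phij)
                dphi = min(dphi, 2 * math.pi - dphi)
                dr2 = (etai - etaj) ** 2 + dphi ** 2
                dij = min(eti, etj) ** 2 * dr2 / R ** 2
                if dij < best[2]:
                    best = ("pair", (i, j), dij)
        if best[0] == "beam":
            jets.append(objs.pop(best[1]))  # promote to a jet
        else:
            i, j = best[1]
            (eti, etai, phii), (etj, etaj, phij) = objs[i], objs[j]
            et = eti + etj  # simple ET-weighted recombination
            merged = (et, (eti * etai + etj * etaj) / et,
                      (eti * phii + etj * phij) / et)
            objs = [o for k, o in enumerate(objs) if k not in (i, j)]
            objs.append(merged)
    return jets

towers = [(50.0, 0.1, 0.0), (30.0, 0.2, 0.1),  # one hard central jet
          (0.3, 4.5, 2.0), (0.2, 4.7, 2.1)]    # soft forward towers
print(len(kt_jets(towers)))              # soft towers still form a "jet"
print(len(kt_jets(towers, et_cut=0.5)))  # the input ET cut removes them
```

Without an input cut, the two soft forward towers cluster with each other (their pairwise distance is tiny) and get promoted to a jet of their own; an ET cut on the input towers removes them before clustering, which is the trade-off against truth agreement discussed in the bullets above.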
Number of Kt jets in calorimeter and MC truth: ok!
[Plot: jet multiplicity for MC truth vs. tower jets, showing good agreement.]
Update on vertexing II (Andreas Wildauer)
• Vertexing tools work quite stably by now
  – Billoir FastFit method heavily in use
  – SlowFit method: first use case by B-physics people in Artemis (J. Catmore)
• InDetPriVxFinder package is the standard primary vertex finder in Athena reconstruction
• Several clients for VxBilloirTools already (B-physics group, several people from Bonn University, b-tagging, …)
Results of InDetPriVxFinder (Andreas Wildauer)
• H→uu and H→bb with vertex constraint (0., 0.) ± (0.015, 56) mm, Geant4; 4800 events in total
[Plots: vertex resolutions of 12 µm / 32 µm and 13 µm / 36 µm.]
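A fast vertex fit of the Billoir type combines track parameters and their covariances analytically. The essential idea can be illustrated with a much-simplified 1D analogue: the chi²-optimal vertex position from track impact parameters is the inverse-variance weighted mean. This is a toy (invented numbers), not the actual VxBilloirTools algorithm, which fits the 3D vertex and track momenta together:

```python
# Toy 1D analogue of a fast vertex fit: each track contributes a measured
# transverse position x_i with uncertainty sigma_i; the chi^2-optimal
# vertex estimate is the inverse-variance weighted mean.

def weighted_vertex(measurements):
    """measurements: list of (x_mm, sigma_mm); returns (x_hat, sigma_hat)."""
    wsum = sum(1.0 / s ** 2 for _, s in measurements)
    x_hat = sum(x / s ** 2 for x, s in measurements) / wsum
    return x_hat, wsum ** -0.5  # combined uncertainty shrinks with tracks

tracks = [(0.010, 0.020), (0.014, 0.020), (0.000, 0.040)]  # mm, invented
x, err = weighted_vertex(tracks)
print(round(x * 1000, 1), round(err * 1000, 1))  # -> 10.7 13.3 (micrometres)
```

The analytic form is what makes the FastFit variant fast: no iterative minimisation is needed, only sums over tracks.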
Results on QCD di-jet events [from D. Cavalli]
[Plots: OLD H1 calibration (Athens results) vs. NEW H1 calibration from MissingET; the new calibration improves the proportionality curve.]
Combined Test Beam
Combined Test Beam (A. Farilla)
• Immediate plans:
  – Finalize and validate the first version of the simulation package
  – Finalize the first version of reconstruction with ESD and CBNT output
  – Start plugging Combined Reconstruction algorithms into RecExTB
  – Finalize code access to the Conditions Database
  – Develop a basic analysis framework for reconstructed data
  – Work in progress for a combined event display in Atlantis
• Looking for new people from the CTB community to add algorithms from combined reconstruction into RecExTB
That’s it
Acronyms
• PESA – Physics Event Selection Algorithms
• ESRAT – Event Selection, Reconstruction and Analysis Tools
• EDM – Event Data Model
• ID – Inner Detector
• CBT (CTB) – Combined Beam Test
• STL – Standard Template Library
• SEAL – Shared Environment for Applications at LHC (http://cern.ch/seal/)
• AIDA – Abstract Interface for Data Analysis
• LCG – LHC Computing Grid (http://lcg.web.cern.ch)
• PI – Physicist Interface (extends AIDA, part of LCG)
• POOL – Pool Of persistent Objects for LHC (part of LCG)