Common Application Software for the LHC Experiments
CHEP'06, Mumbai, India, 13th February 2006
P. Mato/CERN 13/02/06 CHEP'06 Mumbai

What is common software?
• Our definition of "common software" is the software that is used by at least two LHC experiments
• Nevertheless, we allow some level of speculation
  – Software "requested" by the experiments
  – Software that is going to be used by the experiments
  – Software that "we think" is going to be used/requested
• In general, common software would be of a generic nature and non-specific to one experiment
  – The borderline between generic and specific is somewhat arbitrary
• Not all the common software needs to be developed
  – Most of the needed non-HEP application specific functionality is already freely available

Software Domain Decomposition
[Diagram of the software domains. Recoverable box labels:
  Programs: Simulation Program, Event Reconstruction Program, Detector Calibration, Analysis Program
  Engines, Generators, Framework, Algorithms, Experiment Frameworks
  Persistency, DataBase, Batch, FileCatalog, Conditions, Interactive
  Simulation, Data Management, Grid Services, Physics
  Geometry, Histograms, Fitters, NTuple, MathLibs, I/O, GUI, 2D Graphics, 3D Graphics, Collections
  PluginMgr, Dictionary, Interpreter, Foundation, Utilities, OS binding, Core]

Simplified software structure
[Layer diagram, top to bottom, with its annotations:]
• Applications: built on top of frameworks and implementing the required algorithms
• Exp. Framework: every experiment has a framework for basic services and various specialized frameworks (event model, detector description, visualization, persistency, interactivity, simulation, etc.)
• Simulation, Data Mgt., Distrib. Analysis: specialized domains that are common among the experiments
• Core Libraries: core libraries and services that are widely used and provide basic functionality
• non-HEP specific software packages: many non-HEP libraries are widely used

Outline of the talk
• Important issues for building a large software system
  – Minimizing duplication
  – Software re-use
  – Integration elements
• Rapid walk through the various common software domains
  – Core libraries and services
  – Data management and persistency framework
  – Simulation framework
  – Distributed analysis
• Software configuration, releases and installations
• The LCG Applications Area
  – Objectives, organization and main goals for 2006
• Conclusions

Minimizing duplication
• Duplication of functionality is not efficient for the long-term maintenance of the software
  – We are developing large software systems that will need to last decades, so any duplication will have a cost associated
  – It causes incompatibilities and incoherencies between the different software packages, and an increase in the memory footprint
• Some level of duplication is needed and wanted
  – Unavoidable for novel functionality, during prototyping, for mission-critical parts, etc.
  – It may stimulate healthy competition
• The recent merge of the SEAL and ROOT projects in the LCG Application Area goes in this direction
  – Provide a coherent set of products developed and maintained by the AA for the benefit of its clients
  – Ease the long-term maintenance and evolution of a single set of software products

Software re-use
• Encourage the use of third-party components wherever possible
• Many obstacles to overcome
  – too broad functionality / lack of flexibility in components
  – organisational: reuse requires a broad overview to ensure a unified approach
  – cultural
    » don't trust others to deliver what we need
    » fear of dependency on others
    » fail to share information with others
    » developers fear loss of creativity
• Software re-use adds dependencies: we need to act and minimize them
  – select/standardize on a single package for a given functionality
    » e.g. a single XML parser library should be sufficient
  – favor in some cases embedding small fractions of third-party code over an explicit external dependency
    » simplification of the release procedure, distribution, etc.

External Libraries
• Select only libraries widely accepted by the community
  – avoid two libraries for the same functionality
  – open source and public domain software favored
  – multi-platform (UNIX, Windows, etc.)
• The AA/SPI project manages the installation, distribution and configuration of all needed libraries in a coherent manner
  – standard structure of directories
  – experiments and projects decide what is installed
  – management of dependencies
  – creation, repository and distribution of tarballs
• The goal is to simplify the use of external libraries
  – transparent towards the experiments and users

Integration elements
• The common application software should facilitate the integration of independently developed components to build a coherent application
• Dictionaries
  – Dictionaries provide metadata information (reflection) to allow introspection and interaction of objects in a generic manner
  – Our strategy is to evolve to a single reflection system (Reflex)
• Scripting languages
  – Interpreted languages are ideal for rapid prototyping
  – They allow integration of independently developed software modules (software bus)
  – Standardizing on CINT and Python scripting languages
• Component model and plugin management
  – Modeling the application as components with well-defined interfaces
  – Loading the required functionality at runtime
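
The last two points (introspection of objects and loading functionality at runtime) can be illustrated with a minimal Python sketch. This is not the Reflex or SEAL plugin API: the `PluginManager` class, its `declare`/`create` methods and the "module:Class" target string are hypothetical names chosen for this example, and a standard-library class stands in for an experiment component.

```python
import importlib


class PluginManager:
    """Hypothetical plugin manager: resolves component names at runtime."""

    def __init__(self):
        # component name -> ("module:Class" target, required method names)
        self._declared = {}

    def declare(self, name, target, required_methods=()):
        """Register a component by name without importing its module yet."""
        self._declared[name] = (target, tuple(required_methods))

    def create(self, name, *args, **kwargs):
        """Import the module on demand, check the interface, instantiate."""
        target, required = self._declared[name]
        module_name, class_name = target.split(":")
        module = importlib.import_module(module_name)  # loaded only when requested
        cls = getattr(module, class_name)
        # Introspection: verify the class offers the methods the caller expects.
        missing = [m for m in required if not callable(getattr(cls, m, None))]
        if missing:
            raise TypeError(f"{target} lacks required methods: {missing}")
        return cls(*args, **kwargs)


# Usage: collections.Counter plays the role of an independently developed component.
manager = PluginManager()
manager.declare("counter", "collections:Counter", ["update", "most_common"])
counter = manager.create("counter")
counter.update(["muon", "electron", "muon"])
top_particle = counter.most_common(1)[0]
```

The application code only knows the name "counter" and the methods it needs; which module provides them is decided by the declaration, which is the point of the component model described above.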

Strategic role of C++ reflexion
[Diagram. Recoverable labels: Python, CINT, Root meta C++, ROOT, Reflex DS, Reflex; X.h; rootcint -cint, rootcint -reflex, rootcint -gccxml; XDictcint.so]
• Object I/O
• Scripting (CINT, Python)
• Plug-in management
• etc.

Core libraries and services
• ROOT provides basic functionality needed by any application
  – It is now at the "root" of the software for all the LHC experiments
• Current work packages
  – BASE: Foundation and system classes, documentation and releases
  – DICT: Reflexion system, meta classes, CINT and Python interpreters
  – I/O: Basic I/O, Trees, queries
  – PROOF: parallel ROOT facility, xrootd
  – MATH: Mathematical libraries, histogramming, fitting
  – GUI: Graphical User interfaces and Object editors
  – GRAPHICS: 2-D and 3-D graphics
  – GEOM: Geometry system
  – SEAL: Maintenance of the existing SEAL packages

Data Management
• FILES - based on ROOT I/O
  – Targeted for complex data structure: event data, analysis data
  – Based on Reflex object dictionaries
  – Management of object relationships: file catalogues
  – Interface to Grid file catalogs and Grid file access
• Relational Databases – Oracle, MySQL, SQLite
  – Suitable for conditions, calibration, alignment, detector description data possibly produced by online systems
  – Complex use cases and requirements, multiple 'environments' – difficult to be satisfied by a single solution
  – Isolating applications from the database implementations with a standardized relational database interface
    » facilitate the life of the application developers
    » no change in the application to run in different environments
    » encode "good practices" once for all
  – Focus moving into deployment and experiment support
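
The idea of isolating applications from the database implementation can be sketched as follows. This is only an illustration in Python with the standard `sqlite3` module, not the CORAL or COOL API: the `ConditionsStore` interface, the table layout, the channel name and the interval-of-validity lookup are all assumptions made for the example.

```python
import sqlite3
from abc import ABC, abstractmethod


class ConditionsStore(ABC):
    """Hypothetical backend-neutral interface seen by application code."""

    @abstractmethod
    def put(self, channel, since, value):
        """Store a value valid for `channel` from time `since` onwards."""

    @abstractmethod
    def get(self, channel, when):
        """Return the value valid for `channel` at time `when`, or None."""


class SQLiteConditionsStore(ConditionsStore):
    """One possible backend; another class could target a different database."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS conditions "
            "(channel TEXT, since INTEGER, value REAL, "
            "PRIMARY KEY (channel, since))"
        )

    def put(self, channel, since, value):
        self._conn.execute(
            "INSERT OR REPLACE INTO conditions VALUES (?, ?, ?)",
            (channel, since, value),
        )
        self._conn.commit()

    def get(self, channel, when):
        # Most recent entry whose validity starts at or before `when`.
        row = self._conn.execute(
            "SELECT value FROM conditions WHERE channel = ? AND since <= ? "
            "ORDER BY since DESC LIMIT 1",
            (channel, when),
        ).fetchone()
        return None if row is None else row[0]


def calibrated_energy(store, channel, raw, when):
    """Application code: depends only on the ConditionsStore interface."""
    gain = store.get(channel, when)
    if gain is None:
        raise LookupError(f"no calibration for {channel} at {when}")
    return raw * gain


store = SQLiteConditionsStore()
store.put("ecal/gain", since=0, value=1.0)
store.put("ecal/gain", since=100, value=1.5)
energy_before = calibrated_energy(store, "ecal/gain", raw=10.0, when=50)
energy_after = calibrated_energy(store, "ecal/gain", raw=10.0, when=150)
```

`calibrated_energy` would run unchanged against any other class implementing `ConditionsStore`, which is the "no change in the application to run in different environments" property listed above.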

Persistency framework
• The AA/POOL project is delivering a number of "products"
  – COOL – Detector conditions database
  – POOL – Object and references persistency framework
  – CORAL – Generic database access interface
  – ORA – Mapping C++ objects into relational database
• Object storage and references successfully used in large scale production in ATLAS, CMS, LHCb
• Need to focus on database access and deployment in Grid
  – basically starting now
[Diagram. Recoverable labels: USER CODE; POOL API, COOL API; STORAGE MGR, COLLECTIONS, FILE CATALOG, COOL; ROOT I/O, RDBMS, CORAL; Oracle, MySQL, SQLite]

Simulation
• MC generators
  – MC generators specialized on different physics domains, developed by different authors
  – Need to guarantee support for the LHC experiments and collaboration with the authors
• Simulation engines
  – Geant4 and Fluka are well established products
• Common additional utilities required by the experiments
  – Interoperability between MC generators and simulation engines
  – Interactivity, visualization and analysis facilities
  – Geometry and Event data persistency
  – Comparison and validation (between engines and real data)

Simulation framework utilities
• HepMC: C++ Event Record for Monte Carlo Generators
• GDML: Geometry description markup language
  – Geometry interchange format or geometry source
  – GDML writers and readers exist for Geant4 and ROOT
• Geant4 Geometry persistency
  – Saving/retrieving Geant4 geometries with ROOT I/O
• FLUGG: using Geant4 geometry from FLUKA
  – Framework for comparing simulations
  – Example applications have been developed
• Python interface to Geant4
  – Provide Python bindings to G4 classes
  – Steering Geant4 applications from Python scripts
• Utilities for MC truth handling

Simulation components
[Diagram. Recoverable labels: Steering, Python scripts, text editor; GDML and TGeom (each marked with W and R); MC generators, Pythia, MCDB; HepMC; Geant4, Flugg, Fluka; geom.root, MC truth, root]

Distributed data analysis
• Full spectrum of different analysis applications will be coexisting
  – Data analysis applications using the full functionality provided by the experiment's framework (analysis tools, databases, etc.)
    » Requiring a big fraction of the available software packages and very demanding on computing and I/O
    » Typically batch processing
  – Final analysis of ntuple-like data (ROOT trees)
    » Fast turn-around (interactive)
    » Easy migration from local to distributed (PROOF)
• Tools to help the physicists are being made available
  – Large scale grid job submission (GANGA)
  – Parallelization of the analysis jobs (PROOF)
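
The general pattern behind parallelizing an analysis job (split the data, process the pieces independently, merge the partial results) can be sketched in a few lines of Python. This does not use or imitate the PROOF or GANGA interfaces: the function names, the bin width and the sample values are hypothetical, and a thread pool stands in for a set of worker nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def fill_histogram(values, bin_width):
    """Count how many values fall into each bin of the given width."""
    histogram = Counter()
    for value in values:
        histogram[int(value // bin_width)] += 1
    return histogram


def split(values, n_chunks):
    """Cut the input into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def parallel_histogram(values, bin_width, n_workers=4):
    """Fill one partial histogram per chunk, then merge by adding bin counts."""
    chunks = split(values, n_workers)
    merged = Counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(lambda chunk: fill_histogram(chunk, bin_width), chunks)
        for partial in partials:
            merged.update(partial)  # Counter.update adds counts
    return merged


# Hypothetical ntuple column (e.g. a mass-like quantity).
masses = [0.5, 1.2, 1.7, 2.1, 2.9, 3.3, 1.1, 0.2]
serial = fill_histogram(masses, bin_width=1.0)
parallel = parallel_histogram(masses, bin_width=1.0, n_workers=3)
```

Because filling a histogram is additive, the merged result is identical to the serial one, which is what makes the migration from local to distributed running straightforward for this kind of analysis. Threads are used here only to keep the sketch self-contained; they do not give real CPU parallelism in CPython.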

Software installations
• All AA software (external and internal) is available in /afs/cern.ch/sw/lcg
• Organized as <package>/<version>/<platform>
  – The platform keyword is made of operating system, architecture and compiler version
• Tar files (sources and/or binaries) for distribution are also available
• More than 100 packages available
  – For many packages only the client-side is required
• Automated installations done by a system of scripts and XML file descriptions
• Packages, versions are decided in Architects Forum (AF)
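
As an illustration of the <package>/<version>/<platform> layout, the Python sketch below builds an installation path from its parts. The underscore separator in the platform keyword, the sample values ("slc3", "ia32", "gcc323"), the package name "examplepkg" and the assumption that packages sit directly under the root directory are not stated on the slide and are used here only as examples.

```python
from pathlib import PurePosixPath

# Root directory quoted on the slide.
LCG_ROOT = PurePosixPath("/afs/cern.ch/sw/lcg")


def platform_keyword(os_name, architecture, compiler):
    """Combine OS, architecture and compiler version into one keyword.

    Assumption: the three parts are joined with underscores.
    """
    return "_".join([os_name, architecture, compiler])


def install_path(package, version, platform, root=LCG_ROOT):
    """Return <root>/<package>/<version>/<platform>."""
    return root / package / version / platform


# Hypothetical package and platform values.
platform = platform_keyword("slc3", "ia32", "gcc323")
path = install_path("examplepkg", "1.0.0", platform)
```

Keeping the platform as the innermost directory lets several builds of the same package version coexist side by side.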

Software configuration
• An LCG configuration is a combination of packages and versions which are coherent and compatible
• Configurations are given names like "LCG_40"
• Experiments build their application software based on a given LCG configuration
  – Interfaces to the experiments' configuration systems are provided (SCRAM, CMT)
  – Concurrent configurations are an everyday situation
• Configurations are decided in the AF
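
A configuration in this sense can be modelled as a named mapping from packages to versions. The Python sketch below is only an illustration: the `Configuration` class, the `check_against` helper, the configuration names and all package names and version numbers are hypothetical and do not describe the content of any real LCG configuration or the SCRAM/CMT interfaces.

```python
from dataclasses import dataclass


@dataclass
class Configuration:
    """A named, mutually compatible set of package versions."""
    name: str
    versions: dict  # package name -> version string


def check_against(config, requirements):
    """Return {package: (wanted, provided)} for every unmet requirement.

    `provided` is None when the configuration does not contain the package.
    """
    problems = {}
    for package, wanted in requirements.items():
        provided = config.versions.get(package)
        if provided != wanted:
            problems[package] = (wanted, provided)
    return problems


# Two hypothetical configurations in use at the same time.
config_a = Configuration("EXAMPLE_A", {"corelib": "1.0", "persistency": "2.3"})
config_b = Configuration("EXAMPLE_B", {"corelib": "1.1", "persistency": "2.4"})

# A hypothetical experiment application built against specific versions.
requirements = {"corelib": "1.0", "persistency": "2.3"}
problems_a = check_against(config_a, requirements)
problems_b = check_against(config_b, requirements)
```

An application is built against exactly one configuration name; the check makes explicit why two experiments, or two releases of one experiment, may need different configurations concurrently.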

Software releases
• The AA/Experiments software stack is quite large and complex
  – Many steps and many teams are involved
• Only 2-3 production quality releases per year are affordable
  – Complete documentation, complete platform set, complete regression tests, test coverage, etc.
• Feedback is required before the production release is made
  – No clear solution on how to achieve this
  – Currently under discussion
• Bug fix releases as often as needed
  – Quick reaction time and minimal time to release
[Diagram labelled "release order", showing the layers: Applications; Exp. Framework; Simulation, Data Mgt., Distrib. Analysis; Core Libraries; non-HEP specific software packages]

LCG Application Area Focus
• Deliver the common physics applications software for the LHC experiments
• Organized to ensure focus on real experiment needs
  – Experiment-driven requirements and monitoring
  – Architects in management and execution
  – Open information flow and decision making
  – Participation of experiment developers
  – Frequent releases enabling iterative feedback
• Success is defined by adoption and validation of the products by the experiments
  – Integration, evaluation, successful deployment

Applications Area Organization
[Organization chart. Recoverable labels: MB, LHCC; Work plans, Quarterly Reports, Reviews, Resources; Alice, Atlas, CMS, LHCb; Chairs, AA Manager; Architects Forum; Application Area Meeting; Decisions; LCG AA Projects: SPI, ROOT, POOL, SIMULATION (boxes WP1, WP2, WP3, Subproject 1); External Collaborations: ROOT, Geant4, EGEE]

AA Projects
• SPI – Software process infrastructure (A. Pfeiffer)
  – Software and development services: external libraries, Savannah, software distribution, support for build, test, QA, etc.
• ROOT – Core Libraries and Services (R. Brun)
  – Foundation class libraries, math libraries, framework services, dictionaries, scripting, GUI, graphics, SEAL libraries, etc.
• POOL – Persistency Framework (D. Duellmann)
  – Storage manager, file catalogs, event collections, relational access layer, conditions database, etc.
• SIMU – Simulation project (G. Cosmo)
  – Simulation framework, physics validation studies, MC event generators, Garfield, participation in Geant4 and Fluka

AA presentations to this conference
• SPI – Software process infrastructure
  – [229] The LCG SPI project in LCG Phase II
• ROOT – Core Libraries and Services
  – [114] Recent Developments in the ROOT I/O and TTrees
  – [185] Reflex, reflection for C++
  – [227] New Developments of ROOT Mathematical Software Libraries
  – [383] New features in ROOT geometry modeller for representing non-ideal geometries
  – [93] ROOT 3D graphics
  – [187] ROOT GUI, General Status
  – [98] PROOF - The Parallel ROOT Facility
• POOL – Persistency Framework
  – [329] CORAL, a software system for vendor-neutral access to relational databases
  – [337] COOL Development and Deployment - Status and Plans
• SIMU – Simulation project
  – [371] Geant4 in production: status and developments
  – [257] Recent developments and upgrades to the Geant4 geometry modeler
  – [259] Geometry Description Markup Language and its application-specific bindings
  – [300] Geant4 Acceptance Suite for Key Observables
  – [432] Development, validation and maintenance of MC generators & generator services in the LHC era

Main AA Goals for 2006
• Seek confirmation that ALL products developed in AA are really going to be used by the LHC collaborations
• Pay special attention to recent requirements, problems, worries, wishes, etc. from the LHC experiments
• Develop strong links with the LHC experiments
  – Each AA developer should be in close contact with at least one LHC experiment
• Improve overall time to release
  – Production and bug fix releases
  – Experiments' feedback before production releases
• Finish the migration of the SEAL functionality
• Better documentation (Web)

Conclusions
• AA is providing common software to all LHC experiments
  – The software has already been used by the LHC experiments for several years
  – Currently consolidating a number of key products
• The LCG AA has recently entered phase II
  – A number of changes and adaptations are being implemented
  – The structure has been adapted to provide better support to experiments
• Establishing the level of long-term support that is required for the products that are essential for the experiments
  – Minimizing duplication
  – Re-using software and infrastructure across projects
  – Easing maintenance of AA software at the end of the LCG project