A Baseline for CMS Central Software architecture framework
A Baseline for CMS Central Software architecture, framework and services in simulation, high level trigger, reconstruction and analysis
Vincenzo Innocente, CERN/EP/CMC
Internal Review of CMS Software and Computing, 19 September 2002, CERN
Architecture Framework Toolkit Tasks
- Architecture
- Generic Framework specializations
  - Fast Simulation
  - Detector Description
  - Reconstruction
  - Analysis
  - Visualization
- Utility Toolkits
- Graphics tools
- Technology tracking and evaluation
Central Software Vincenzo Innocente, CERN/EP Internal Review of CMS Software and Computing 19 September 2002, CERN
This Talk
- What happened since the last internal review
  - Exit Objectivity
  - New LHC schedule
  - LCG
- CMS Software Architecture
  - Overview
  - LCG “Blueprint”
  - Baseline
- Highlight on recent developments
  - Prototype for a new Persistency Solution
  - Simulation Framework
  - Detector Description Data-Base
  - Visualization
  - Technology Tracking
2002: A transition year
- CMS Software development plans affected by
  - CERN budget crisis
  - LHC delay
  - New “LHC Computing Grid” project
  - Choices and decisions of CERN IT
  - Person-power shortage
- The only possible plan was planning for changes
- New Strategy
  - Terminate autonomous large scale R&D
    - Contribute to LCG
  - Concentrate on CMS specific issues
    - Consolidate current software to finalize DAQ-TDR
    - Initiate prototyping for a new persistency solution
    - Continue review of architecture and implementation
  - Plan for software baseline end 2003
Baselining Architecture Framework Toolkit
- Define and prototype the initial production software for LHC operation
  - Review Architecture
  - Choose products
  - Prototype and implement middleware
  - Implement framework and toolkit
- Needs collaboration (or at least synchronization) with LCG.
- CMS Strategy:
  - Proactive collaboration within LCG projects
    - Define priorities, inject CMS person-power in critical areas
  - Early evaluation and integration
    - Get a new working software already in the next months
  - Mitigate risks in areas where convergence seems slow
    - using a more flexible architecture
    - making our own choices and implementing ad-interim CMS-specific interfaces and abstractions
CCS role in LCG era
- Basic Software components will be provided by LCG as
  - Supported External software (mainly public domain)
  - In-house developed components
    - Part of a larger effort (HEP or wider)
    - Specific to LHC
- The main role of CMS CCS
  - Directly contribute to LCG projects
  - Develop specific CMS software to integrate, interface, extend the components provided by LCG
    - Provide a coherent and consistent framework
    - Hide “native” API and UI if conflicting or not conforming
    - Ensure transparency to Physics and Reconstruction Software
- New Plan
  - 9/02 Prototype (confirmed for October)
  - 6/03 Match current functionality and performance
  - 9/03 Baseline CMS CCS Software
CMS Contribution to LCG Project
- Participation in RTAGs at the highest level of responsibility and technical competence
  - First experience very positive
    - Collaborative, friendly atmosphere
    - Real effort to define common products
- Direct involvement in the project
  - CMS “Chief Architect” in PEB & Architect Forum
  - Already one person working full time in persistency project
  - Ready to contribute with > 50% of AFT person-power
- Full commitment to use products of common LCG projects
Architecture (Framework) Review
- Current implementation, design and foundations hit by:
  - Objectivity fate in HEP
  - GRID claims for responsibility
  - Unclear future in CMS of Python, Lizard, SWIG, ROOT, CINT
  - Obsolescence of many basic components (some chosen before 1998)
- Many fundamental issues reopened:
  - Loose or tight integration with LCG persistency
  - Sharing of responsibility with GRID
  - MetaData: how, where, who
  - Which UI, which GUI
  - Scripting: yes, no, at which level
  - C++ interpreter OR compiled and dynamically-loaded code
  - Integration with build system and running environment
- These very issues are addressed by the Blueprint RTAG, whose report is due in a few days. We can anticipate that CMS endorses it.
Build on Current Success
- Success of 2002 Data Challenge (Spring production)
  - Distributed production
    - Simple installation
    - Smooth running
    - Easy crash-recovery
    - Easy problem-detection
    - Flexible “data product” selection
  - Distributed analysis
    - Fast data validation and publishing
    - Data and software distribution working from Tier 1 down to Tier 5 (portable PC)
    - Efficient selection and cloning mechanism at sub-event level
    - Direct contribution of many physicists to ORCA software
- Thanks to (or in spite of) COBRA, Objectivity/DB, ORCA and software-release and data-production tools
- New CMS software should provide at least the same functionality and performance to start with
Architecture Overview
[Diagram. Layers: Consistent User Interface; Coherent set of basic tools and mechanisms; Distributed Data Store & Computing Infrastructure. Other labels: Data Browser, Generic analysis Tools, Analysis job wizards, Detector/Event Display, Federation wizards, GRID, Objy, ORCA, COBRA, tools, OSCAR, FAMOS, CMS tools, Software development and installation]
Comp. Sci Analogy
[Diagram labels: FTP, TELNET, SSH; File System, Socket API, Posix API; TCP/IP protocol; Distributed Data Store & Computing Infrastructure]
Component Architecture Framework Layering
[Diagram labels: Specific Frameworks; Event Filter; LCG Generic Application Framework; Grid-Uploadable Physics modules; Reconstruction Algorithms; Physics Analysis; Calibration Objects; Configuration; Event Objects; Data Monitoring; (Grid-aware) Data-Products; Adapters and Extensions; Basic Services; (O/R) DBMS; GEANT 3/4; CLHEP; PAW Replacement; C++ Standard Library + Extension Toolkits]
Blueprint Architectural Elements
- Types usable in public interfaces
- Object dictionary
- Object bus (whiteboard)
- Component bus (Plug-in manager)
- Scripting language (ROOTCINT and Python both available)
- Component configuration
- Basic framework services
  - Framework infrastructures: creation of objects (factories), lifetime, multiplicity and scope (singleton, multiton, smart pointers), communication & discovery (e.g. registries), …
  - Core services: Component management, incident management, monitoring & reporting, GUI manager, exception handling, …
  - System services
- Foundation and utility libraries
- Use of external software
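The "component bus (plug-in manager)" and "factories" elements can be illustrated with a small registry that maps component names to factories. This is only a sketch in Python: `PluginManager`, `register`, `create` and `TrackFinder` are hypothetical names chosen for illustration, not part of any LCG or CMS interface.

```python
from typing import Callable, Dict, List


class PluginManager:
    """Hypothetical component bus: maps a component name to a factory."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], object]] = {}

    def register(self, name: str, factory: Callable[[], object]) -> None:
        # Refuse silent replacement so that each name has exactly one provider.
        if name in self._factories:
            raise KeyError(f"component {name!r} already registered")
        self._factories[name] = factory

    def create(self, name: str) -> object:
        # Clients ask for a component by name and never see the concrete class.
        try:
            factory = self._factories[name]
        except KeyError:
            raise LookupError(f"no component named {name!r}") from None
        return factory()

    def names(self) -> List[str]:
        return sorted(self._factories)


class TrackFinder:
    """Illustrative component; here it simply counts the hits it is given."""

    def run(self, hits: List[float]) -> int:
        return len(hits)


manager = PluginManager()
manager.register("TrackFinder", TrackFinder)
finder = manager.create("TrackFinder")
```

Because clients only hold the name, a different implementation can be registered under "TrackFinder" without touching the calling code.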
Blueprint Domain Decomposition
- Legend
  - Red: agreed to be common, project started
  - Brown: at least partly foreseen as common project
  - Grey: not foreseen as common project
- Domains shown (colour coding not preserved in this transcript): Object dictionary and object model; Persistency and data management; Event processing framework; Event model; Event generation; Detector simulation; Detector geometry and materials; Trigger/DAQ; Event reconstruction; Detector calibration; Scripting, interpreter; GUI toolkits; Graphics; Analysis tools; Math libraries and statistics; Job management; Core services; Foundation and utility libraries; Grid middleware interfaces
Blueprint architecture design precepts
- Software structure: foundation; basic framework; specialized frameworks
- Component model: APIs, collaboration (‘master/slave’, ‘peer-to-peer’), physical/logical module granularity, plug-ins, abstract interfaces, composition vs. inheritance, …
- Service model: Uniform, flexible access to functionality
- Object models: dumb vs. smart, enforced policies with run-time checking, clear and bullet-proof ownership model
- Distributed operation
- Global objects
- Dependencies
- Interface to external components: generic adapters
- Exception handling
- …
Do you remember ZEBRA?
- ZEBRA was essentially obsolete already at the start-up of LEP
- It proved impossible to replace it (or even re-engineer it) with a modern dynamic memory manager because of
  - the way it was designed and implemented internally
  - the pervasive way it was used, down into the heart of physics and reconstruction software
- We should prevent this situation from recurring at LHC
  - Components should be independent and isolated
  - The use of external products should be limited to their own domain
  - Avoid gratuitous and exaggerated re-use: one hammer does not fit all screws
  - The core of the physics and reconstruction software should not depend on the details of the event-processing framework
- The cost of code duplication and adapter layers will pay off as soon as a component needs to be replaced
  - Happening already: exit ObjectSpace, enter Boost; exit Objy, enter ROOT
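The adapter-layer argument can be sketched as follows: physics code depends only on a small abstract interface, and each external product sits behind its own adapter. All names here (`HitSource`, `DictStoreAdapter`, `TextStoreAdapter`, `total_energy`) are hypothetical, and the two toy back-ends merely stand in for two interchangeable persistency products.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class HitSource(ABC):
    """Hypothetical interface: the only thing physics code is allowed to see."""

    @abstractmethod
    def hits(self, event_id: int) -> List[float]:
        ...


class DictStoreAdapter(HitSource):
    """Adapter over an in-memory dictionary (stand-in for one external product)."""

    def __init__(self, store: Dict[int, List[float]]) -> None:
        self._store = store

    def hits(self, event_id: int) -> List[float]:
        return list(self._store.get(event_id, []))


class TextStoreAdapter(HitSource):
    """Adapter over 'id: v1 v2 ...' text lines (stand-in for a replacement product)."""

    def __init__(self, text: str) -> None:
        self._rows: Dict[int, List[float]] = {}
        for line in text.strip().splitlines():
            key, _, values = line.partition(":")
            self._rows[int(key)] = [float(v) for v in values.split()]

    def hits(self, event_id: int) -> List[float]:
        return list(self._rows.get(event_id, []))


def total_energy(source: HitSource, event_id: int) -> float:
    # "Physics" code: unchanged whichever back-end is plugged in.
    return sum(source.hits(event_id))


old_backend = DictStoreAdapter({1: [0.5, 1.5], 2: [2.0]})
new_backend = TextStoreAdapter("1: 0.5 1.5\n2: 2.0")
```

Swapping `old_backend` for `new_backend` changes no line of `total_energy`; the price is one thin adapter per product, which is the trade-off the slide argues for.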
Baseline: not a mere shopping list
- Objective is a manageable component architecture made of realistically replaceable units
- Define the global software model
  - Granularity, role and nature of “Modules”: physical vs logical modules (at the last CMS plenary, M. Livny concluded by asking for statically linked, checkpointable executables…)
  - Reuse model of sub-components
  - Which integration technologies (“glues”) have to be used, where and how
- Define THE set of basic components
  - Their functionalities
  - The domain of their applicability
  - Identify independent “vertical slices” and model their integration
- Agree on Metrics to measure modularity
  - Not only Frameworks, also applications based on them
Metrics: NCCD vs Cycles
[Plot: NCCD versus cycles for Toolkits & Frameworks. Points labelled: ATLAS, ROOT, ORCA, G4, COBRA, Anaphe, IGUANA]
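For readers unfamiliar with the metric, the sketch below computes NCCD on toy dependency graphs, assuming the slide uses it in Lakos's sense: the cumulative component dependency (CCD) divided by the CCD of a balanced binary tree with the same number of components. The graphs and component names are invented for illustration and are not measurements of any of the packages in the plot.

```python
import math
from typing import Dict, Iterable, Set

Graph = Dict[str, Iterable[str]]  # component -> components it depends on


def components(deps: Graph) -> Set[str]:
    found = set(deps)
    for targets in deps.values():
        found.update(targets)
    return found


def closure(deps: Graph, start: str) -> Set[str]:
    """All components needed to build or test `start`, including itself."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(deps.get(node, ()))
    return seen


def ccd(deps: Graph) -> int:
    """Cumulative component dependency: sum of closure sizes over all components."""
    return sum(len(closure(deps, c)) for c in components(deps))


def nccd(deps: Graph) -> float:
    """CCD normalised to a balanced binary tree of the same size."""
    n = len(components(deps))
    if n == 0:
        raise ValueError("empty dependency graph")
    balanced_tree_ccd = (n + 1) * (math.log2(n + 1) - 1) + 1
    return ccd(deps) / balanced_tree_ccd


# Toy graphs (hypothetical component names).
layered = {"app": ["reco", "geom"], "reco": [], "geom": []}
cyclic = {"app": ["reco"], "reco": ["geom"], "geom": ["app"]}
```

In the layered graph each leaf needs only itself, so NCCD is 1.0; in the cyclic graph every component drags in all the others and NCCD rises to 1.8.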
User Entry-Point
Persistent Classes
[Diagram label: User Entry-Point]
Prototyping new Persistency Solution
- June-December 2001: Oracle Evaluation
  - Oracle 9i judged not mature enough to be used as Event Store
- November 2001 - September 2002: ROOT Evaluation
  - Vanilla ROOT: use ROOT-Trees to model CMS event
  - COBRA-ROOT: use COBRA event model, replacing ooRef with TRef
- April 2002 - : LCG Hybrid Store
  - Remove or isolate explicit Objectivity dependencies in COBRA
    - PRS software already has none
  - Produce a description of the ROOT file format
  - Model MetaData using an RDBMS
  - Identify all missing features required by a “production-grade” software
- January-December 2002: Evaluation of RDBMS for Analysis
  - Use RDBMS to model user data (event tags, n-tuple)
  - Oracle 9i, Java, distributed WEB services
  - PostgreSQL, C++, Python, Lizard
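The idea of modelling user data (event tags) in an RDBMS can be sketched with a single table and an SQL selection. The schema, column names and cut values are invented for illustration, and SQLite from the Python standard library stands in for the Oracle and PostgreSQL systems that were actually evaluated.

```python
import sqlite3

# In-memory database; a real catalogue would live on a database server.
conn = sqlite3.connect(":memory:")

# Hypothetical event-tag table: one row per event, a few summary quantities.
conn.execute(
    """
    CREATE TABLE event_tag (
        run        INTEGER NOT NULL,
        event      INTEGER NOT NULL,
        n_muons    INTEGER NOT NULL,
        missing_et REAL    NOT NULL,
        PRIMARY KEY (run, event)
    )
    """
)

rows = [
    (1, 1, 0, 5.0),
    (1, 2, 2, 35.5),
    (2, 1, 3, 12.0),
]
conn.executemany("INSERT INTO event_tag VALUES (?, ?, ?, ?)", rows)

# Select candidate events on the tags alone, without reading the event store.
selected = conn.execute(
    "SELECT run, event FROM event_tag "
    "WHERE n_muons >= ? AND missing_et > ? ORDER BY run, event",
    (2, 20.0),
).fetchall()
```

The resulting (run, event) pairs would then be used to fetch only the matching events from the object store.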
COBRA-ROOT
- Minimal perturbation of current framework, PRS software and production environment
- Usage of minimal ROOT-specific functionalities
  - Based on new ROOT functionalities such as TRefs and support for std containers
- Full port of all COBRA persistency to ROOT (event and metadata)
  - Integration with an RDBMS or flat-file MetaData catalog should be easy
- Easy to port to LCG-POOL implementation
- May suffer performance penalties w.r.t. a full “Rootification”
  - Prototype performance already similar to current Objy implementation
- Compatibility with ROOT data analysis unclear
- Current prototype able to run DAQ-TDR jobs sequentially
  - More work required for parallel population of a dataset
“Vanilla” ROOT
- Full exploitation of all ROOT capabilities as event modeler and manager
  - Use of Trees, Folder, CloneArrays etc.
  - Optimization of data structure to match ROOT requirements
  - Use foreign classes for “data” objects (mitigate impact on PRS)
- Heavy impact on framework and partially on PRS code
- Only the event has been re-modeled; MetaData require a new model based on an RDBMS
- Should give maximal ROOT performance
- Full compatibility with ROOT as analysis environment
- Porting to final LCG-POOL will require major changes
  - LCG-POOL is supposed to hide the use of implementation-specific data structures such as TTrees
A temporary persistency solution
- During the next CMS week, final reports on the two prototypes will be presented.
- We propose to base the new version of COBRA (COBRA 7) on the COBRA-ROOT prototype, as it offers a seamless transition from the current architecture.
- We suggest maintaining the Vanilla-ROOT prototype as a test-bed for new ROOT features and to study integration issues between POOL and ROOT in a controlled environment.
- The expertise developed in the Vanilla-ROOT prototype should also be used to implement analysis objects based on ROOT trees in the framework of COBRA 7.
- Working products (COBRA, ORCA, OSCAR, IGUANA) based on the new persistency solution, with all the current basic functionalities, should be ready for January 2003.
- The ability to test and integrate LCG-POOL components, as they are released, should be an integral part of the new product.
Simulation
- Full re-engineering of all framework aspects of CMS simulation software (OSCAR and FAMOS)
  - In line with reconstruction framework
  - Full re-use of common interfaces, mechanisms and components
  - Integration of DDD
- Cleaner division between CCS and PRS responsibilities
  - Generic mechanisms in COBRA (Mantis sub-system)
  - CMS-specific software in OSCAR
  - Specific packages to host detector code
- This work incorporates and expands experiences from pre-existing software from CMS and ATLAS
- We consider it an important contribution toward a future LCG simulation framework
Mantis Packages
Mantis application
Detector Description
- DDD marked a major milestone with a full round-trip within Geant4
  - Full persistency mechanism for G4 geometry
- Integration tests with ORCA to replace G3 geometry model are in progress (Tracker ready)
- CMS Prototype essentially finished:
  - Not CMS specific
  - Supports several geometry models
  - Not bound to XML
  - Relies on external geometry modelers (currently G3 or G4)
- An RTAG on the subject is in progress and we hope it will recommend a common project
Detector Description DataBase
Visualization
- Specialized Framework (IGUANA)
  - Working on full integration with COBRA
- Interactive ORCA Visualisation
  - Reconstruction Geometry
  - Reconstructed Objects
  - MonteCarlo truth
- Interactive Geant4 Visualisation
  - Explore and visualise
    - the physical volume tree, with all the usual IGUANA 3D features: view, picking, slices, …
    - Trajectories (soon hits)
    - Magnetic field
  - Integrated with DDD overlap detection
    - Find overlaps, show result details in a list and highlight overlaps in 3D
  - Geant4 command line
  - A wizard to guide through OSCAR settings
  - G3 functionalities (DCUT, DTREE)
  - Volume tree selectors (using attributes: by material, sensitive, etc.)
Technology tracking and evaluation
- Tracking the evolution of technology is essential to anticipate the need for changes.
- Recent activities:
  - Python
  - XML
  - RDBMS
  - Template Meta Programming (boost, Loki)
  - Math libraries (gsl, TNT, Blitz)
  - New versions of Qt, OIV
  - gcc 3.1, icc, Solaris CC 5.4
  - Non-intrusive performance profilers (valgrind, oprofile)
- CMS software R&D activities have always been tightly coupled with LCB RD-projects first, and IT/DB and IT/API later. Today they are fully integrated into the activities of LCG.
Schedule
Oct 02   COBRA 7 prototype
Nov 02   first Hybrid store release
Jan 03   COBRA 7 first release
Jan 03   start LCG integration
Jun 03   first LCG-1 release
Jun 03   release production sw
Jun 03   Hybrid event store
Jul 03   Start Preparation DC04
Sep 03   baseline DC04 software
Nov 03   LCG-1 fully operational
Jan 04   release DC04 software
Feb 04   DC04
Apr 04   distributed analysis
Conclusions
- We have to guarantee a working software system that satisfies CMS needs at any given time
- Past and recent experience shows that we should make sure that any of its components can be replaced on a timescale of one year with negligible negative impact on PRS.
- LCG software will provide the foundation of CMS baseline
  - Expect to integrate first LCG components early next year
  - DC04 will make use of LCG-1 software
  - DC04 preparation may still require ad-hoc solutions
- DC04 experience will tell us where to go next