Bookkeeping Tutorial
Bookkeeping content
- Contains records of all “jobs” and all “files” that are produced by production jobs
- Job:
  - In fact technically a “step” in a workflow
    - E.g. “Gauss step”, “Brunel step”…
  - For real RAW data: the “job” is in fact a DAQ run
  - Has input files (except runs and Gauss)
  - Has output files
    - Note that files may not be kept (i.e. have a replica)
    - All files are registered in order to keep the full history
  - Has metadata
    - Location, production number, application, CPUTime, etc…
- Files:
  - Always output of a “job”
  - Files are defined by an LFN (Logical File Name)
  - Contain metadata
    - Number of events, size, event type, etc…
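The job and file records described on this slide could be pictured roughly as follows. This is only a sketch: the class names, field names, the example LFN and the event type value are invented for illustration and do not reflect the actual bookkeeping schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FileRecord:
    """A bookkeeping file entry, identified by its LFN (hypothetical layout)."""
    lfn: str
    n_events: int
    size_bytes: int
    event_type: int
    has_replica: bool = True  # a registered file may no longer have a replica


@dataclass
class JobRecord:
    """A bookkeeping "job", i.e. one step of a workflow (hypothetical layout)."""
    application: str  # e.g. "Gauss", "Brunel"
    production: int
    inputs: List[str] = field(default_factory=list)  # LFNs; empty for Gauss or DAQ runs
    outputs: List[FileRecord] = field(default_factory=list)
    metadata: Dict[str, object] = field(default_factory=dict)


# Made-up example: a Gauss step with no input and one output that was not kept.
sim = FileRecord("/example/prod/file_1.sim", n_events=500, size_bytes=10_000_000,
                 event_type=12345678, has_replica=False)
gauss = JobRecord("Gauss", production=1234, outputs=[sim],
                  metadata={"CPUTime": 3600.0})
```

The output file is still recorded even though `has_replica` is false, which is what allows the full history to be kept.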
Bookkeeping purpose
- Provenance database
  - Contains the full history of productions
    - Traceability of datasets
- User dataset search
  - Select a list of files from selection criteria
    - Only files with a replica!
    - Generate Gaudi configuration file
  - Give also access to the job/file tree
    - E.g. investigate history of a file
- Production datasets search
  - Select the dataset to be processed by production jobs
    - Ensures consistency of input files for a production
  - Uses directly the BK API to get the list of files
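To illustrate the two uses named on this slide, here is a minimal sketch of walking a file's history and of restricting a user search to files with a replica. The in-memory dictionaries and the function names are assumptions made for the example; the real bookkeeping is a database reached through the BK API, which is not shown here.

```python
from typing import Dict, List, Set

# Hypothetical provenance map: output LFN -> LFNs read by the job that produced it.
PARENTS: Dict[str, List[str]] = {
    "/example/file.dst": ["/example/file.digi"],
    "/example/file.digi": ["/example/file.sim"],
    "/example/file.sim": [],  # Gauss step: no input files
}

# Hypothetical replica flags: registered files are not necessarily still stored.
HAS_REPLICA: Dict[str, bool] = {
    "/example/file.dst": True,
    "/example/file.digi": False,
    "/example/file.sim": False,
}


def ancestors(lfn: str) -> List[str]:
    """Return every ancestor of `lfn`, nearest first, by walking the job/file tree."""
    seen: Set[str] = set()
    ordered: List[str] = []
    queue = list(PARENTS.get(lfn, []))
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.add(current)
        ordered.append(current)
        queue.extend(PARENTS.get(current, []))
    return ordered


def selectable(lfns: List[str]) -> List[str]:
    """Keep only the files a user search would offer: those with a replica."""
    return [lfn for lfn in lfns if HAS_REPLICA.get(lfn, False)]
```

In this toy data the history of the DST can still be traced back to the Gauss output, although only the DST itself would be returned by a user search.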
Bookkeeping partitioning
- Configuration name / version
  - Real data
    - <DAQ partition> / <activity>
  - Simulated data
    - “MC” / <activity>
      - <activity>: “2008” / “DC06” / …
- Conditions
  - Parameters of initial data
    - All subsequent processed data inherit the “conditions”
  - Real data
    - DAQ conditions
      - Beam conditions, energy, magnetic field, detector conditions…
  - Simulated data
    - Simulation conditions
      - Beam energy, magnetic field, luminosity, generator settings…
Processing pass
- Associated to a level of processing
  - Within a given partition (config name / version + conditions)
  - Corresponds to the whole processing workflow
    - Single workflow for a given processing pass
    - Compatible versions of applications
  - Specifies the processing pass of input data when applicable
    - Sequence of processing
  - Re-processing creates branches
[Diagram: processing stages Sim, Reco, Stripping; applications Gauss, Boole, Brunel, DaVinci; file types SIM, DIGI, DST, ETC]
Other query parameters
- Event type
  - File property
  - Real data
    - 90000000: real data full stream
    - 90000001: real data express stream
    - Types to be defined for stripping streams
  - Simulated data
    - LHCb convention for decay tree
- File type
  - Data content / format
    - Format not yet used
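The query parameters introduced so far (configuration name and version, conditions, processing pass, event type, file type) can be thought of as the fields of one selection. The sketch below is an illustration only: the class, the slash-separated path layout and the placeholder values are assumptions, not the bookkeeping's actual query format. Only the two event type codes are taken from the slide.

```python
from dataclasses import dataclass

# Event type codes quoted on the slide for real data.
EVENT_TYPES = {
    90000000: "real data full stream",
    90000001: "real data express stream",
}


@dataclass(frozen=True)
class BKQuery:
    """One dataset selection (hypothetical structure)."""
    config_name: str      # real data: DAQ partition; simulation: "MC"
    config_version: str   # the activity, e.g. "2008"
    conditions: str
    processing_pass: str
    event_type: int
    file_type: str

    def path(self) -> str:
        """Join the fields in the order they were introduced (illustrative only)."""
        parts = [self.config_name, self.config_version, self.conditions,
                 self.processing_pass, str(self.event_type), self.file_type]
        return "/" + "/".join(parts)


# Placeholder values; only the event type is a code from the slide.
query = BKQuery("ExamplePartition", "2008", "ExampleConditions",
                "ExamplePass", 90000000, "DST")
```

Each field narrows the selection further, which mirrors the way the GUI's query tree is browsed level by level.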
Running the bookkeeping GUI
- Needs a valid Grid certificate
- Needs an X server
- lhcb-bkk
  - SetupProject Dirac
    - Sets up the environment
  - If needed: lhcb-proxy-init
    - Creates a proxy
  - dirac-bookkeeping-gui
- Individual commands can be issued from the prompt!
The query tree
More info
- Right click on
  - Conditions
  - Processing pass
Event type and file type
Dataset selection
[Screenshot callout: Logical File Name]
Saving configuration (a.k.a. options) file
- Python configuration (default)
  - Still possible to create .opts (discouraged!)
  - .txt file for just a list of LFNs
- All files or selected files (if any)
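The plain-text variant, a list of LFNs covering either the selected files or all of them, could look like the sketch below. The function name, the one-LFN-per-line layout and the example LFNs are assumptions; the GUI's actual output format is not shown on the slide.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List, Optional


def save_lfn_list(path: Path, all_lfns: List[str],
                  selected: Optional[Iterable[str]] = None) -> List[str]:
    """Write one LFN per line: the selection if there is one, otherwise all files."""
    wanted = set(selected) if selected is not None else set()
    chosen = [lfn for lfn in all_lfns if lfn in wanted] if wanted else list(all_lfns)
    path.write_text("".join(f"{lfn}\n" for lfn in chosen))
    return chosen


# Made-up LFNs, written to a temporary directory so nothing is left behind.
LFNS = ["/example/a.dst", "/example/b.dst", "/example/c.dst"]
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "dataset.txt"
    written = save_lfn_list(target, LFNS, selected=[LFNS[1]])
    content = target.read_text()
```

Passing no selection (or an empty one) falls back to writing every file, matching the "all files or selected files (if any)" behaviour.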
Dealing with PFNs or XML catalogs
- Using ganga + DIRAC
  - Bookkeeping integrated in ganga:
    - dataset = browseBK()
  - LFN handling is then automatic…
- If you really need XML catalog or PFNs, use genXMLCatalog
  - Ensures files are available on the specified site
  - Gets the PFN from the Storage Element
    - Not constructed “by hand”
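The point that PFNs are looked up rather than built by hand, and that availability at the requested site is checked, can be sketched as follows. This is not how genXMLCatalog is implemented: the replica table, site names, URLs and function name are all invented for the example.

```python
from typing import Dict, List

# Hypothetical replica table: LFN -> {site: PFN as reported by the Storage Element}.
REPLICAS: Dict[str, Dict[str, str]] = {
    "/example/a.dst": {"SITE-A": "srm://se.site-a.example/data/a.dst"},
    "/example/b.dst": {"SITE-A": "srm://se.site-a.example/data/b.dst",
                       "SITE-B": "srm://se.site-b.example/data/b.dst"},
}


def pfns_at_site(lfns: List[str], site: str) -> Dict[str, str]:
    """Map each LFN to its PFN at `site`; fail loudly if any file is not there."""
    missing = [lfn for lfn in lfns if site not in REPLICAS.get(lfn, {})]
    if missing:
        raise LookupError(f"no replica at {site} for: {', '.join(missing)}")
    # The PFN is read from the table, never assembled from the LFN.
    return {lfn: REPLICAS[lfn][site] for lfn in lfns}
```

For instance, `pfns_at_site(["/example/a.dst", "/example/b.dst"], "SITE-A")` returns both PFNs, while asking for `"SITE-B"` raises because `/example/a.dst` has no replica there.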
Dealing with XML catalog and PFNs
DIRAC Monitoring web portal
General information
- Entry point to the DIRAC web portal
  - http://dirac.cern.ch
- Web implementation of (almost) a full desktop application
  - Monitoring of productions / jobs
  - Accounting (jobs, data management)
  - Allows taking actions on jobs
- Authentication / authorisation is mandatory
  - Anonymous access gives minimal access
  - Get a certificate and load it in your browser
    - https://twiki.cern.ch/twiki/bin/view/LHCb/FAQ/Certificate
  - DIRAC authorisation through “DIRAC groups”
    - Default: lhcb_user
    - Other groups: lhcb_prod, dirac_admin…
    - Future: specific groups per physics groups, PPG (for production authorisation)…
  - Capabilities depend on the group
The DIRAC portal home page
[Screenshot callouts: Menus, DIRAC instance, DIRAC group, Identity]
Job Monitoring
[Screenshot callouts: Info, Actions, Selection]
Job Monitoring (cont’d)
- Selection
  - For group lhcb_user, only see your own jobs
  - Can select with
    - Status
    - Site
    - Date…
- Columns
  - Can tailor the columns to be displayed
  - Clicking toggles the sorting in the column
- Rows
  - Jobs displayed in pages (default 25 rows, don’t exceed 100)
  - Can scroll pages
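The selection and paging behaviour described here can be mimicked in a few lines. This is a toy model, not the portal's code: the job dictionaries and function names are invented, and treating the "don't exceed 100" advice as a hard cap is an assumption of the sketch.

```python
from typing import Dict, List, Optional

DEFAULT_PAGE_SIZE = 25   # default number of rows per page, as on the slide
MAX_PAGE_SIZE = 100      # the slide advises not to exceed 100 rows

Job = Dict[str, object]


def select_jobs(jobs: List[Job], status: Optional[str] = None,
                site: Optional[str] = None) -> List[Job]:
    """Keep the jobs matching every criterion that was given."""
    return [job for job in jobs
            if (status is None or job["status"] == status)
            and (site is None or job["site"] == site)]


def page(rows: List[Job], number: int, size: int = DEFAULT_PAGE_SIZE) -> List[Job]:
    """Return page `number` (1-based), clamping the page size to 1..100."""
    size = max(1, min(size, MAX_PAGE_SIZE))
    start = (number - 1) * size
    return rows[start:start + size]


# Made-up job list: even ids are "Done", odd ids are "Failed".
jobs = [{"id": i, "status": "Done" if i % 2 == 0 else "Failed", "site": "SITE-A"}
        for i in range(120)]
done = select_jobs(jobs, status="Done")
```

With this data, `page(done, 3)` holds the last ten "Done" jobs, and `page(jobs, 1, size=500)` is cut back to 100 rows.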
Logging info
Output peeking
Attributes
Parameters