Enabling Grids for E-sciencE: APEL CPU Accounting
Enabling Grids for E-sciencE
APEL CPU Accounting in the EGEE/WLCG infrastructure
Cristina del Cano Novales, John Gordon
STFC - RAL
www.eu-egee.org
EGEE-III INFSO-RI-222667
EGEE and gLite are registered trademarks
Summary
• Overview
• APEL Client
• Data Transportation
• Accounting Data Centre
• EGEE Accounting Portal
• APEL SAM tests
• Standards
• Status
• Future Plans
APEL - Overview
• APEL (Accounting Processor for Event Logs)
• Data collection and reporting services
• Large centralised database
• Collects and aggregates CPU usage information from sites across the Grid
Some Statistics
• Storing ~200 M individual job records since 2004
• More than 100 M further records held as aggregated summaries from other Grids (OSG/NDGF)
• 442 different sites
• 673 M CPU hours ≈ 28 M CPU days ≈ 76,000 CPU years
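The unit conversion on this slide can be checked with a few lines of arithmetic. This is a minimal sketch; the 365.25-day year is an assumption, since the slide does not say which year length was used.

```python
# Convert the CPU-hour total quoted on the slide into days and years.
CPU_HOURS = 673e6        # "673 M CPU hours", taken from the slide
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365.25   # assumption: average year length

cpu_days = CPU_HOURS / HOURS_PER_DAY
cpu_years = cpu_days / DAYS_PER_YEAR

print(f"{cpu_days / 1e6:.1f} M CPU days")
print(f"{cpu_years:,.0f} CPU years")
```

Under that assumption the result is about 28 M days and somewhat under 77,000 years, consistent with the rounded figures on the slide.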
APEL Client
• Log processing application
• Interprets system log files (gatekeeper and batch system logs) to produce accounting records
• Currently supports PBS, LSF, SGE and Condor, and could be extended to support other batch systems
• Collects usage information after a job has completed
• Distributed as part of the gLite middleware
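To illustrate the kind of log processing described above, here is a minimal sketch that turns one PBS/Torque-style accounting line into a record. It is not the APEL parser. The sample line, host name, user and output field names are hypothetical, and the semicolon-separated layout with `key=value` pairs is assumed from the usual PBS accounting log format.

```python
from datetime import timedelta
from typing import Optional


def hms_to_seconds(value: str) -> int:
    """Convert an HH:MM:SS duration string into whole seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return int(timedelta(hours=hours, minutes=minutes, seconds=seconds).total_seconds())


def parse_pbs_end_record(line: str) -> Optional[dict]:
    """Return a usage record for a job-end ("E") line, or None for anything else."""
    parts = line.rstrip("\n").split(";", 3)
    if len(parts) != 4 or parts[1] != "E":
        return None  # only finished jobs carry final resource usage
    _timestamp, _record_type, job_id, message = parts
    fields = dict(item.split("=", 1) for item in message.split() if "=" in item)
    return {
        "job_id": job_id,
        "user": fields.get("user"),
        "group": fields.get("group"),
        "cpu_seconds": hms_to_seconds(fields["resources_used.cput"]),
        "wall_seconds": hms_to_seconds(fields["resources_used.walltime"]),
    }


# Hypothetical sample line, for illustration only.
SAMPLE_LINE = (
    "03/10/2009 14:02:11;E;4711.ce.example.org;user=alice group=atlas "
    "queue=short resources_used.cput=00:45:10 resources_used.walltime=01:00:00"
)

record = parse_pbs_end_record(SAMPLE_LINE)
print(record)
```

Only end-of-job lines are turned into records, which matches the statement that usage is collected after a job has completed.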
Data Transportation
• Currently using two different interfaces
  – R-GMA (Relational Grid Monitoring Architecture)
    § Majority of EGEE sites publishing via APEL-RGMA
    § Some EGEE sites using own sensor and APEL publisher (with R-GMA)
  – Direct MySQL insertion
    § OSG – Gratia
    § INFN – DGAS
    § NDGF – SGAS
Accounting Data Centre
• Receives records from R-GMA
• Processes and stores the accounting records produced by the grid resources, including:
  – Decryption of the UserDNs
  – VOMS-level:
    § Extraction of VO, primary Group and Role from the UserFQAN
  – Normalisation:
    § For each tuple, normalised CPU and wall times are derived from the SpecInt2000 value and the raw CPU and wall times
  – Aggregation:
    § Anonymous and user-level summaries are generated
  – Encryption:
    § The user-level summaries are encrypted before they are sent to the CESGA Accounting Portal. Access to these summaries is controlled using SSL and ACLs.
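The VOMS extraction, normalisation and aggregation steps above could be sketched as follows. This is illustrative only and not the APEL implementation. The 1000-SpecInt2000 reference machine, the FQAN layout (`/vo/group/Role=.../Capability=...`), the site names and the record fields are all assumptions.

```python
from collections import defaultdict

# Assumption: times are normalised to a reference machine rated at 1000 SpecInt2000.
REFERENCE_SI2K = 1000.0


def parse_fqan(fqan: str) -> dict:
    """Split a VOMS FQAN into VO, primary group and role (role is None if unset)."""
    groups, role = [], None
    for part in fqan.strip("/").split("/"):
        if part.startswith("Role="):
            value = part[len("Role="):]
            role = None if value == "NULL" else value
        elif "=" not in part:
            groups.append(part)
    if not groups:
        raise ValueError(f"no group component in FQAN: {fqan!r}")
    return {"vo": groups[0], "group": "/" + "/".join(groups), "role": role}


def normalise(raw_seconds: float, spec_int_2000: float) -> float:
    """Scale a raw duration by the worker node's SpecInt2000 rating."""
    return raw_seconds * spec_int_2000 / REFERENCE_SI2K


def anonymous_summary(jobs: list) -> dict:
    """Sum normalised CPU seconds per (site, VO), dropping user identity."""
    totals = defaultdict(float)
    for job in jobs:
        vo = parse_fqan(job["fqan"])["vo"]
        totals[(job["site"], vo)] += normalise(job["cpu_seconds"], job["si2k"])
    return dict(totals)


# Hypothetical job records, for illustration only.
JOBS = [
    {"site": "SITE-A", "fqan": "/atlas/higgs/Role=production/Capability=NULL",
     "cpu_seconds": 3600, "si2k": 1500},
    {"site": "SITE-A", "fqan": "/atlas/Role=NULL/Capability=NULL",
     "cpu_seconds": 1800, "si2k": 1500},
    {"site": "SITE-B", "fqan": "/cms/Role=NULL", "cpu_seconds": 1000, "si2k": 2000},
]

summary = anonymous_summary(JOBS)
print(summary)
```

A user-level summary would add the user DN to the grouping key, which is why those summaries need the encryption step described on the slide.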
Accounting Data Centre
gLite MON box
• Receives data from R-GMA
• Data stored for 3 days
• Open access, so personal data is encrypted
Main repository for Accounting Data
• “Offline” – not accessible
• Contains all accounting data since 2005
• Archiving of records depends on a policy document that is being drafted
• Contains summaries for the Accounting Portal
• Contains dedicated tables for OSG, INFN, NDGF
• Personal data encrypted with the Portal’s public key
• Creates and publishes APEL SAM tests
EGEE Accounting Portal
http://www3.egee.cesga.es/gridsite/accounting/CESGA/egee_view.php
APEL SAM Tests
• Provide monitoring of APEL for production sites
• Two tests provided
  – APEL-pub: Critical test. Checks the date of the latest record published.
    § Older than 7 days => Warn
    § Older than 31 days => Error (site notified)
  – APEL-sync: Compares the number of records in the central database with the number of records in the local database.
    § > 10 records difference => Warn
    § > 100 records difference => Error
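The thresholds of the two tests can be written down as a small classification sketch. The function names and the "OK" status string are hypothetical; only the 7/31-day and 10/100-record limits come from the slide.

```python
from datetime import date


def apel_pub_status(latest_record: date, today: date) -> str:
    """Classify a site by the age of its most recently published record."""
    age_days = (today - latest_record).days
    if age_days > 31:
        return "ERROR"  # the site is notified at this level
    if age_days > 7:
        return "WARN"
    return "OK"


def apel_sync_status(central_count: int, local_count: int) -> str:
    """Classify a site by the gap between central and local record counts."""
    difference = abs(central_count - local_count)
    if difference > 100:
        return "ERROR"
    if difference > 10:
        return "WARN"
    return "OK"


today = date(2009, 3, 10)
print(apel_pub_status(date(2009, 3, 1), today))   # 9 days old
print(apel_sync_status(10_000, 9_950))            # 50 records apart
```

The stricter test is checked first in each function so that a value beyond the error limit is never reported as only a warning.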
APEL Plan
• Sites should:
  1. Measure resources using HEPSPEC06
  2. Multiply the result by 250
  3. Publish it as SI00, as before
  4. Set the Glue HEPSPEC06 value. This shows that the new benchmark has been used
• APEL gathers comparable data from all sites
• Monitoring can identify which sites have or have not changed benchmark, and tickets can be raised
• The CESGA Portal can show usage in either (both?) benchmarks by conversion. Eventually, when most sites have changed, the portal default will change
• When GLUE 2.0 is deployed, publish raw HEPSPEC06 values
  – Or possibly reuse SI00
EGEE transition plan - Bob Jones - CB - 3 March 2009
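The conversion in this plan is a fixed scaling, which is why the portal can present usage in either benchmark. A minimal sketch with hypothetical function names; the factor of 250 is the one given on the slide.

```python
SI00_PER_HEPSPEC06 = 250  # scaling factor from the slide


def hepspec06_to_si00(hepspec06: float) -> float:
    """Value a site publishes as SI00 after benchmarking with HEPSPEC06."""
    return hepspec06 * SI00_PER_HEPSPEC06


def si00_to_hepspec06(si00: float) -> float:
    """Inverse conversion, e.g. for showing existing SI00 data in HEPSPEC06."""
    return si00 / SI00_PER_HEPSPEC06


# Hypothetical worker node measured at 8 HEPSPEC06 per core.
published_si00 = hepspec06_to_si00(8)
print(published_si00)
print(si00_to_hepspec06(published_si00))
```

Because the factor is constant, usage already normalised in SI00 can be re-expressed in HEPSPEC06 without reprocessing the underlying job records.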
Future Plans
• Main ideas:
  – ActiveMQ to replace R-GMA as the transport mechanism
    § Interoperability with other tools
    § Expertise easily available
    § Using existing infrastructure
  – New architecture to allow regionalisation but not impose it
  – Maintain Central Repository for multi-grid/VO/user queries
  – Standard publishing methods – RUS (???)
Future Plans
• Regionalisation:
  – Use cases:
    § Region A: APEL – non-regionalised
    § Region B: APEL – regionalised
    § Region C: Other sensor – own accounting system
  – Regions can be Grids
Future Plans – Current Architecture
Future Plans – Future Architecture
Future Plans – Regional Accounting Server
Future Plans – Central Accounting Server
Standards
• Already use OGF-UR
  – Participate in OGF WG
• OGF-RUS as a standard interface for publishing
  – Designed for XML
  – Existing accounting services use relational databases
  – General agreement on publishing
  – More difficult to implement full XPath queries on a relational DB
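The mismatch between an XML-oriented interface and relational storage can be illustrated by running the same lookup both ways. This is a rough sketch, not an OGF-RUS implementation. The namespace URI and element names follow my reading of the OGF Usage Record format and should be checked against the specification; the table layout and sample records are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumption: namespace URI and element names as in the OGF Usage Record format.
URF_NS = "http://schema.ogf.org/urf/2003/09/urf"
NS = {"urf": URF_NS}


def q(tag: str) -> str:
    """Qualify a tag or attribute name with the usage-record namespace."""
    return f"{{{URF_NS}}}{tag}"


def build_usage_record(record_id: str, user: str, cpu_duration: str) -> ET.Element:
    """Build a deliberately tiny usage record with three fields."""
    record = ET.Element(q("UsageRecord"))
    ET.SubElement(record, q("RecordIdentity"), {q("recordId"): record_id})
    identity = ET.SubElement(record, q("UserIdentity"))
    ET.SubElement(identity, q("LocalUserId")).text = user
    ET.SubElement(record, q("CpuDuration")).text = cpu_duration
    return record


# Hypothetical records: (record id, local user, ISO 8601 CPU duration).
JOBS = [("rec-1", "alice", "PT45M10S"), ("rec-2", "bob", "PT2H"), ("rec-3", "alice", "PT5M")]

root = ET.Element(q("UsageRecords"))
for job in JOBS:
    root.append(build_usage_record(*job))


def xml_records_for(user: str) -> list:
    """Path-style lookup on the XML documents."""
    return [
        rec.find("urf:RecordIdentity", NS).get(q("recordId"))
        for rec in root.findall("urf:UsageRecord", NS)
        if rec.findtext("urf:UserIdentity/urf:LocalUserId", namespaces=NS) == user
    ]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE usage_record (record_id TEXT, local_user_id TEXT, cpu_duration TEXT)")
db.executemany("INSERT INTO usage_record VALUES (?, ?, ?)", JOBS)


def sql_records_for(user: str) -> list:
    """The same lookup against a flat relational table."""
    rows = db.execute(
        "SELECT record_id FROM usage_record WHERE local_user_id = ? ORDER BY record_id",
        (user,),
    )
    return [row[0] for row in rows]


print(xml_records_for("alice"), sql_records_for("alice"))
```

A fixed query like this one is easy to translate by hand. The difficulty noted on the slide arises when a service must accept arbitrary XPath expressions and map each of them onto a relational schema.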
Status
• New APEL Client built in ETICS
• Prototype consumer under test
• First external site test in August – Australia
Plan
• EGEE-III plan - by end of EGEE
  – Change to ActiveMQ
  – Regionalise regions where desired
• EGI plan
  – Can distribute to NGIs
  – NGIs could implement their own accounting service and interface it like OSG, INFN, NDGF, ...
Summary
• The infrastructure underlying APEL will change over the next year.
• This should result in a more flexible and resilient service.
• The results will continue to be published through the same portal, so users will see no change.
• The new infrastructure will allow national accounting repositories and portals while still allowing worldwide visualisation for worldwide VOs.
Questions