Information and Monitoring The European Data Grid Project

  • Slides: 31
Information and Monitoring
The European DataGrid Project Team
http://www.eu-datagrid.org

Contents

  • Grid Information Systems
  • GMA and R-GMA
  • Tools and APIs

EU DataGrid: Information and Monitoring

Features of a grid information system

  • Provides information on both:
      - The Grid itself
          - Mainly for the middleware packages
          - The user may query it to understand the status of the Grid
      - Grid applications
          - For users
  • Flexible infrastructure
      - Able to cope with nodes in a distributed environment with an unreliable network
      - Dynamic addition and deletion of information producers
      - Security system able to address the access to information at a fine level of granularity
      - Allow new data types to be defined
      - Scaleable
      - Good performance
      - Standards based

GMA

  • From GGF
  • Very simple model
  • Does not define:
      - Data model
      - Data transfer mechanism
      - Registry implementation

[Diagram: the Producer stores its location in the Registry; the Consumer looks up the location in the Registry; the Consumer then executes a query on, or receives a stream from, the Producer.]

R-GMA

  • Use the GMA from GGF
  • A relational implementation
      - Powerful data model and query language
          - All data modelled as tables
          - SQL can express most queries in one expression
  • Applied to both information and monitoring
  • Creates impression that you have one RDBMS per VO

[Diagram: the Producer stores its location in the Registry; the Consumer looks up the location in the Registry; the Consumer then executes a query on, or receives a stream from, the Producer.]

Relational Data Model in R-GMA

  • Not a general distributed RDBMS system, but a way to use the relational model in a distributed environment where global consistency is not important
  • Producers
      - announce: SQL "CREATE TABLE"
      - publish: SQL "INSERT"
  • Consumers
      - collect: SQL "SELECT"
  • Some producers, the Registry and Schema make use of RDBMS as appropriate, but what is central is the relational model
  • All R-GMA tuples are time-stamped
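
The announce/publish/collect mapping can be tried out locally. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for R-GMA; it is not the R-GMA API. The table userTable and its columns are borrowed from the edg-rgma example in this deck, and the explicit ts column is an assumption that only imitates the time-stamp R-GMA attaches to every tuple.

```python
import sqlite3
import time

# Stand-in for R-GMA: an in-memory SQLite database, NOT the real R-GMA API.
db = sqlite3.connect(":memory:")

# Producer "announce": declare the table it will publish to.
db.execute(
    "CREATE TABLE userTable ("
    "userId VARCHAR(50), aString VARCHAR(50), aReal REAL, anInt INT, ts REAL)"
)


def publish(user_id, a_string, a_real, an_int):
    """Producer "publish": insert one tuple, stamped with the current time."""
    db.execute(
        "INSERT INTO userTable VALUES (?, ?, ?, ?, ?)",
        (user_id, a_string, a_real, an_int, time.time()),
    )


publish("fisher", "hello", 3.162, 21)

# Consumer "collect": an ordinary SELECT.
rows = db.execute("SELECT userId, aString, anInt, ts FROM userTable").fetchall()
print(rows)
```

Every row that comes back carries its time-stamp in the last column, so a consumer can always tell how old a tuple is.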

Example: 2 tables

  • Service

    Column        Type          Description
    URI           VARCHAR(255)  URI to contact the service
    VO            VARCHAR(50)   Where info should be published, or an empty string to indicate all
    type          VARCHAR(50)   Type of service
    emailContact  VARCHAR(50)   The e-mail of a human being to complain to
    site          VARCHAR(50)   Domain name of site hosting the service
    secure        VARCHAR(1)    'y' or 'n', indicates whether or not this is a secure service
    majorVersion  INT           Version of protocol not implementation
    minorVersion  INT           Version of protocol not implementation
    patchVersion  INT           Version of protocol not implementation

  • ServiceStatus

    Column        Type          Description
    URI           VARCHAR(255)  URI to contact the service
    status        INT           Status code. 0 means the service is up.
    message       VARCHAR(255)  Message corresponding to status code

SQL Example 1

    SELECT DISTINCT type FROM Service

    +--------------------------------------+
    | type                                 |
    +--------------------------------------+
    | GridFTP                              |
    | GRIS                                 |
    | RFIO                                 |
    | R-GMA.ResilientStreamProducerService |
    | R-GMA.ArchiverService                |
    | R-GMA.StreamProducerService          |
    | R-GMA.CanonicalProducerService       |
    | R-GMA.DBProducerService              |
    | R-GMA.LatestProducerService          |
    | GIN                                  |
    | R-GMA.RegistryService                |
    | R-GMA.SchemaService                  |
    | R-GMA.BrowserService                 |
    | GOUT                                 |
    | edg-netmon                           |
    | edg-iperf                            |
    | edg-udpmon                           |
    | myproxy                              |
    | edg-pinger                           |
    +--------------------------------------+
    25 Rows in set

SQL Example 2

    SELECT Service.site, ServiceStatus.status, ServiceStatus.message, Service.URI
    FROM Service, ServiceStatus
    WHERE Service.URI = ServiceStatus.URI
      AND ServiceStatus.status <> 0
      AND Service.type = 'GIN'

    +-----------+--------+----------------+----------------------------+
    | site      | status | message        | URI                        |
    +-----------+--------+----------------+----------------------------+
    | nikhef.nl |      2 | Gin is stopped | http://tbn03.nikhef.nl/GIN |
    | nikhef.nl |      2 | Gin is stopped | http://tbn09.nikhef.nl/GIN |
    | nikhef.nl |      2 | Gin is stopped | http://tbn16.nikhef.nl/GIN |
    +-----------+--------+----------------+----------------------------+
    3 Rows in set
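
The join in this example can be reproduced without a Grid at hand. The sketch below runs the same query against an in-memory SQLite database through Python's sqlite3 module; this is a stand-in for R-GMA, the tables are cut down to the columns the query touches, and the host names and sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite stand-in for the virtual database; NOT the R-GMA API.
db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE Service (URI VARCHAR(255), type VARCHAR(50), site VARCHAR(50));
    CREATE TABLE ServiceStatus (URI VARCHAR(255), status INT, message VARCHAR(255));
    """
)

# Hypothetical sample data: two GIN services (one stopped) and one GOUT service.
db.executemany("INSERT INTO Service VALUES (?, ?, ?)", [
    ("http://host-a.example.org/GIN", "GIN", "example.org"),
    ("http://host-b.example.org/GIN", "GIN", "example.org"),
    ("http://host-c.example.org/GOUT", "GOUT", "example.org"),
])
db.executemany("INSERT INTO ServiceStatus VALUES (?, ?, ?)", [
    ("http://host-a.example.org/GIN", 0, "Gin is running"),
    ("http://host-b.example.org/GIN", 2, "Gin is stopped"),
    ("http://host-c.example.org/GOUT", 2, "Gout is stopped"),
])

# Same shape as the slide's query: GIN services whose status is not 0 (not up).
query = """
    SELECT Service.site, ServiceStatus.status, ServiceStatus.message, Service.URI
    FROM Service, ServiceStatus
    WHERE Service.URI = ServiceStatus.URI
      AND ServiceStatus.status <> 0
      AND Service.type = 'GIN'
"""
failing = db.execute(query).fetchall()
for row in failing:
    print(row)
```

Only the stopped GIN service is returned: the running GIN is filtered out by the status condition and the stopped GOUT by the type condition.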

Data Transfer: Producer to Consumer

  • Consumer can issue one-off queries
      - Similar to normal database query
  • Consumer can also start a continuous query
      - Requests all data published which matches the query
          - As new data matching the query is produced it is streamed to the Consumer
          - Can be seen as an alert mechanism
          - Remember that all tuples carry a time-stamp
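
The difference between the two query styles can be illustrated with a toy in-process model. This is only a sketch under simplifying assumptions: the class StreamSketch and its methods are hypothetical names, queries are plain Python predicates rather than SQL, and nothing here touches the network or the real R-GMA API.

```python
import time


class StreamSketch:
    """Toy in-memory producer: keeps tuples and pushes new ones to subscribers."""

    def __init__(self):
        self.tuples = []        # everything published so far
        self.subscribers = []   # (predicate, callback) pairs for continuous queries

    def insert(self, row):
        stamped = dict(row, timestamp=time.time())  # every tuple is time-stamped
        self.tuples.append(stamped)
        for predicate, callback in self.subscribers:
            if predicate(stamped):
                callback(stamped)   # streamed to the consumer as it is produced

    def one_off(self, predicate):
        """One-off query: answer from what is stored now, then finish."""
        return [t for t in self.tuples if predicate(t)]

    def continuous(self, predicate, callback):
        """Continuous query: register interest in all future matching tuples."""
        self.subscribers.append((predicate, callback))


producer = StreamSketch()
alerts = []
# Continuous query used as an alert: tell me whenever a status is not 0.
producer.continuous(lambda t: t["status"] != 0, alerts.append)

producer.insert({"URI": "svc-1", "status": 0})
producer.insert({"URI": "svc-2", "status": 2})   # matches, so it is streamed

snapshot = producer.one_off(lambda t: t["status"] == 0)
print([t["URI"] for t in alerts], [t["URI"] for t in snapshot])
```

The one-off query returns once with whatever is stored, while the continuous query keeps delivering matching tuples for as long as the subscription exists.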

3 Kinds of Query

[Diagram: tuples are inserted into producers and selected by queries. A StreamProducer answers Continuous Queries, a DataBaseProducer answers History Queries, and a LatestProducer answers Latest Queries.]

Producers

  • StreamProducer: Supports Continuous Queries
      - In memory data structure
      - Can define minimum retention period
  • DataBaseProducer: Supports History Queries
      - Information not lost
      - Supports joins
      - Clean up strategy
  • LatestProducer: Supports Latest Queries
      - As DataBaseProducer but
      - Just holds the latest information for any “primaryish” key
  • CanonicalProducer: Supports anything
      - Offers “anything” as relations
      - User has to write code to handle SQL etc.
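
The contrast between history and latest semantics can be sketched in a few lines. The classes HistoryStore and LatestStore below are hypothetical stand-ins, not R-GMA classes, and the choice of URI as the “primaryish” key is an assumption made for the example.

```python
class HistoryStore:
    """Keeps every tuple, in the spirit of a DataBaseProducer."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)


class LatestStore:
    """Keeps only the newest tuple per key, in the spirit of a LatestProducer."""

    def __init__(self, key_columns):
        self.key_columns = key_columns
        self.rows = {}

    def insert(self, row):
        key = tuple(row[c] for c in self.key_columns)
        self.rows[key] = row    # overwrite any older tuple with the same key


history = HistoryStore()
latest = LatestStore(key_columns=["URI"])

updates = [
    {"URI": "gppse01", "up": "y"},
    {"URI": "gppse02", "up": "n"},
    {"URI": "gppse01", "up": "n"},   # newer status for the same service
]
for row in updates:
    history.insert(row)
    latest.insert(row)

print(len(history.rows), latest.rows[("gppse01",)]["up"])
```

The history store grows with every insert, which is why a clean up strategy matters there, while the latest store stays bounded by the number of distinct keys.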

Registry and Schema

  • Registry has two main tables:
      - Producer
          - Table name
          - Predicate
          - Location
      - Consumer
          - Query
          - Location
  • Schema holds description of tables
      - Column names and types of each table
  • Registry predicate defines subset of “global” table

[Diagram: the Producer stores its location in the Registry and its table description in the Schema; the Consumer looks up the location in the Registry; the Consumer then executes a query on, or receives a stream from, the Producer.]

Contributions to the “global” table

CPULoad (Global Schema)

    Country  Site  Facility  Load  Timestamp
    UK       RAL   CDF       0.3   19055711022002
    UK       RAL   ATLAS     1.6   19055611022002
    UK       GLA   CDF       0.4   19055811022002
    UK       GLA   ALICE     0.5   19055611022002
    CH       CERN  ALICE     0.9   19055611022002
    CH       CERN  CDF       0.6   19055511022002

CPULoad (Producer 1): WHERE country = 'UK' AND site = 'RAL'

    UK       RAL   CDF       0.3   19055711022002
    UK       RAL   ATLAS     1.6   19055611022002

CPULoad (Producer 2)

    UK       GLA   CDF       0.4   19055811022002
    UK       GLA   ALICE     0.5   19055611022002

CPULoad (Producer 3): WHERE country = 'CH' AND site = 'CERN'

    CH       CERN  ATLAS     1.6   19055611022002
    CH       CERN  CDF       0.6   19055511022002

Mediator

  • Queries posed against a virtual data base
  • The Mediator must:
      - find the right Producers
      - combine information from them
  • Hidden component, but vital to R-GMA
  • Will eventually support full distributed queries but for now will only merge information:
      - from multiple producers for queries on one table
      - or over multiple tables from one producer

Queries over “global” table: merging streams

    SELECT * FROM CPULoad WHERE country = 'UK'

Mediator handles merging information from multiple producers for queries on one table.

CPULoad (Consumer)

    Country  Site  Facility  Load  Timestamp
    UK       RAL   CDF       0.3   19055711022002
    UK       RAL   ATLAS     1.6   19055611022002
    UK       GLA   CDF       0.4   19055811022002
    UK       GLA   ALICE     0.5   19055611022002

CPULoad (Producer 1)

    UK       RAL   CDF       0.3   19055711022002
    UK       RAL   ATLAS     1.6   19055611022002

CPULoad (Producer 2)

    UK       GLA   CDF       0.4   19055811022002
    UK       GLA   ALICE     0.5   19055611022002

CPULoad (Producer 3)

    CH       CERN  ATLAS     1.6   19055611022002
    CH       CERN  CDF       0.6   19055511022002
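
One way to picture the merge is as two steps: use the registered predicates to skip producers that cannot contribute, then filter and concatenate the rows from the rest. The sketch below is an illustration only, not the Mediator's actual algorithm. Predicates and queries are reduced to column = value pairs, time-stamps are left out, the function names are hypothetical, and the predicate given for the GLA producer is an assumption.

```python
# Registry entries: each producer's predicate (fixed column values) and its rows.
PRODUCERS = [
    {"predicate": {"country": "UK", "site": "RAL"},
     "rows": [("UK", "RAL", "CDF", 0.3), ("UK", "RAL", "ATLAS", 1.6)]},
    {"predicate": {"country": "UK", "site": "GLA"},   # predicate assumed
     "rows": [("UK", "GLA", "CDF", 0.4), ("UK", "GLA", "ALICE", 0.5)]},
    {"predicate": {"country": "CH", "site": "CERN"},
     "rows": [("CH", "CERN", "ATLAS", 1.6), ("CH", "CERN", "CDF", 0.6)]},
]
COLUMNS = ("country", "site", "facility", "load")


def relevant(predicate, where):
    """A producer can contribute unless its predicate contradicts the query."""
    return all(predicate.get(col, val) == val for col, val in where.items())


def mediate(where):
    """Merge matching rows from every relevant producer."""
    merged = []
    for producer in PRODUCERS:
        if not relevant(producer["predicate"], where):
            continue    # never contacted: its predicate rules it out
        for row in producer["rows"]:
            record = dict(zip(COLUMNS, row))
            if all(record[col] == val for col, val in where.items()):
                merged.append(row)
    return merged


# Equivalent of: SELECT * FROM CPULoad WHERE country = 'UK'
uk_rows = mediate({"country": "UK"})
print(uk_rows)
```

For the UK query the CERN producer is skipped on its predicate alone, and the consumer sees the four RAL and GLA rows as if they came from one table.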

Queries over “global” table: joining tables

    SELECT S.URI, S.emailContact
    FROM Service S, ServiceStatus SS
    WHERE (S.URI = SS.URI AND SS.up = 'n')

Service/ServiceStatus (Consumer)

    URI      emailContact
    gppse02  sysad@rl.ac.uk

Service/ServiceStatus (Latest Producer)

Service

    URI          VO     type  emailContact    site  secure  majorVersion  minorVersion  patchVersion
    gppse01      alice  SE    sysad@rl.ac.uk  RAL   …       …
    gppse01      atlas  SE    sysad@rl.ac.uk  RAL   …       …
    gppse02      cms    SE    sysad@rl.ac.uk  RAL   …       …
    lxshare0404  alice  SE    sysad@cern.ch   CERN  …       …             …
    lxshare0404  atlas  SE    sysad@cern.ch   CERN  …       …

ServiceStatus

    URI          …  up  message
    gppse01         y   SE is running
    gppse02         n   SE ERROR 101
    lxshare0404     y   SE is running

Archiver (Re-publisher)

  • It is a combined Consumer-Producer
      - Follows the GMA concept but packaged for ease of use
  • You just have to tell it what to collect and it does so on your behalf
  • Re-publishes to any kind of “Insertable” (i.e. not to the CanonicalProducer)
      - Can support joins if archiving to a DataBaseProducer or a LatestProducer
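
The consumer-plus-producer idea can be shown with a very small model. This is a sketch under stated assumptions: InsertableStore and ArchiverSketch are hypothetical names, the predicate is a Python function instead of SQL, and tuples are fed in by hand instead of arriving from a continuous query.

```python
class InsertableStore:
    """Anything that accepts inserts (stand-in for a DataBaseProducer or LatestProducer)."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)


class ArchiverSketch:
    """Consumes tuples matching a predicate and re-publishes them unchanged."""

    def __init__(self, target, predicate=None):
        self.target = target
        self.predicate = predicate or (lambda row: True)   # no predicate: take everything

    def consume(self, row):
        # The "consumer" half receives the tuple; the "producer" half re-publishes it.
        if self.predicate(row):
            self.target.insert(row)


archive = InsertableStore()
archiver = ArchiverSketch(archive, predicate=lambda row: row["site"] == "RAL")

for row in [{"site": "RAL", "load": 0.3}, {"site": "CERN", "load": 0.6}]:
    archiver.consume(row)

print(archive.rows)
```

Because the target only needs an insert operation, the same archiver could feed either kind of store, which mirrors the “Insertable” restriction on the slide.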

Topologies

  • Normally publish via SP
  • Archivers instantiated with a Producer and a Predicate
      - Often no predicate
  • Must avoid cycles in the graph

[Diagram: example topology graph with nodes labelled SP, A, DBP and HP.]
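
Whether a planned topology contains a cycle can be checked before anything is deployed. The sketch below uses a standard depth-first search over a directed graph; it is a generic technique, not something R-GMA is stated to provide, and the node names in the two example topologies are hypothetical.

```python
def has_cycle(edges):
    """Return True if the directed graph given as (source, target) pairs has a cycle."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:       # back edge: we returned to the current path
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)


# Hypothetical topologies: an edge means "tuples flow from source to target".
ok = [("SP1", "A1"), ("SP2", "A1"), ("A1", "DBP")]
bad = [("SP1", "A1"), ("A1", "SP2"), ("SP2", "A2"), ("A2", "SP1")]
print(has_cycle(ok), has_cycle(bad))
```

In the second topology a tuple published at SP1 would be re-published back into SP1 indefinitely, which is the situation the slide warns against.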

GIN and GOUT (Gadget IN and Gadget OUT)

[Diagram with components labelled: LDAP InfoProvider, GLUE Schema, GIN, Stream Producer, R-GMA, Archiver, Latest Producer, RDBMS, Consumer (CE), Consumer (SiteInfo), GOUT, Consumer API, LDAP Server, R-GMA Consumers.]

Ranglia

  • R-GMA meets Ganglia
  • A CanonicalProducer is used to interface Ganglia
  • Allows R-GMA queries to be made to Ganglia
      - Not yet released

R-GMA Tools

  • R-GMA Browser
      - Application dynamically generating web pages
      - Supports pre-defined and user-defined queries
  • R-GMA CLI (edg-rgma)
      - Command Line Interface (similar to MySQL)
      - Supports single query and interactive modes
      - Can perform simple operations with Consumers, Producers and Archivers
  • R-GMA packaged SQL (edg-rgma-util)
      - e.g. edg-rgma-util contacts:
          - Command: SELECT siteName, sysAdminContact, userSupportContact, siteSecurityContact FROM SiteInfo

edg-rgma

  • show tables
  • describe ServiceStatus
  • show producers of ServiceStatus
  • latest select * from ServiceStatus
  • old continuous select * from ServiceStatus

edg-rgma: Example

    $> edg-rgma
    rgma> stream declare userTable
    rgma> stream minret 0.2
    rgma> stream INSERT into userTable (userId, aString, aReal, anInt) values ('fisher', 'hello', 3.162, 21)
    rgma> timeout 0.3
    rgma> old continuous SELECT * from userTable
    +--------+---------+-------+-------+-----------------+-----------------+
    | userId | aString | aReal | anInt | MeasurementDate | MeasurementTime |
    +--------+---------+-------+-------+-----------------+-----------------+
    | fisher | hello   | 3.162 |    21 | 2003-11-11      | 11:06:01        |
    +--------+---------+-------+-------+-----------------+-----------------+
    1 Rows in set

APIs

  • Exist in Java, C++, C, Python and Perl
  • C, Python and Perl follow an object based style reflecting the Java and C++ APIs

    Java    myProducer = new StreamProducer();
    C++     myProducer = new edg::info::StreamProducer();
    C       myProducer = StreamProducer_new();
    Perl    $myProducer = edg_rgma_perl::StreamProducer_new();
    Python  myProducer = edg_rgma_python.StreamProducer_new()
            or
            myProducer = rgma.StreamProducer()

Some Times…

  • TerminationInterval
      - Period by which the producer must re-announce its existence
          - If it fails to do so it will be removed from the registry
          - Default is 20 minutes
          - Don’t set it too short
          - Don’t set it too long
  • RetentionPeriod
      - Period for which the published data will remain available, even after the Producer has been closed
          - Default is 0
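
The effect of the TerminationInterval can be pictured as a lease that the producer has to keep renewing. The following is a toy sketch of that idea, not the Registry's implementation: RegistrySketch and its methods are hypothetical, and times are plain numbers of seconds passed in by hand so the example stays deterministic.

```python
class RegistrySketch:
    """Toy registry that forgets producers which stop re-announcing themselves."""

    def __init__(self):
        self.entries = {}   # producer name -> (last announce time, termination interval)

    def announce(self, name, now, termination_interval=20 * 60):
        # 20 minutes is the default quoted on the slide; times are in seconds.
        self.entries[name] = (now, termination_interval)

    def purge(self, now):
        """Drop every producer whose termination interval has elapsed."""
        expired = [n for n, (seen, interval) in self.entries.items()
                   if now - seen > interval]
        for name in expired:
            del self.entries[name]
        return expired


registry = RegistrySketch()
registry.announce("producer-a", now=0)
registry.announce("producer-b", now=0)
registry.announce("producer-a", now=1000)     # producer-a re-announces in time

expired = registry.purge(now=1500)            # producer-b was last seen 1500 s ago
print(expired, sorted(registry.entries))
```

This also suggests why neither extreme is attractive: a very short interval forces constant re-announcing, while a very long one leaves dead producers listed in the registry for a long time.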

C++ Producer: Example

    #include …
    #include "info/StreamProducer.hh"

    int main(int argc, char* args[]) {
        // Expect exactly one argument: the user id to publish under.
        if (argc != 2) {
            std::cout << "One argument must be specified\n" << std::endl;
            exit(1);
        }
        try {
            edg::info::StreamProducer myProducer;

            // Predicate: this producer only contributes rows for the given user.
            std::string astring = std::string("WHERE (userId = '") + std::string(args[1]) + std::string("')");
            std::cout << "Predicate: " << astring << std::endl;
            myProducer.declareTable("userTable", astring);

            // 1200 matches the 20-minute default if the unit is seconds (an assumption).
            myProducer.setTerminationInterval(edg::info::TimeInterval(1200));
            myProducer.setMinRetentionPeriod(edg::info::TimeInterval(600));

            // Publish one tuple as an SQL INSERT statement.
            astring = std::string("INSERT INTO userTable (userId, aString, aReal, anInt) VALUES ('") + std::string(args[1]) + std::string("', 'C++ producer', 3.1415962, 42)");
            std::cout << astring << std::endl;
            myProducer.insert(astring);
        } catch (edg::info::RGMAException& e) {
            std::cout << "Exception " << e.what() << std::endl;
        }
    }

C++ Consumer: Example

    #include …
    #include "info/Consumer.hh"
    #include "info/ResultSet.hh"

    int main() {
        try {
            // Latest query over userTable.
            edg::info::Consumer myConsumer("SELECT * FROM userTable", edg::info::Consumer::LATEST);
            edg::info::TimeInterval Timeout(60);
            myConsumer.start(Timeout);

            // Poll once a second until the query has finished or timed out.
            while (myConsumer.isExecuting()) {
                sleep(1);
            }
            if (myConsumer.hasAborted()) {
                std::printf("Consumer query timed-out\n");
            }

            // Print whatever results are available.
            edg::info::ResultSet resultSet = myConsumer.popIfPossible();
            if (resultSet) {
                std::printf("ResultSet: %s\n", resultSet.toString().c_str());
            }
        } catch (edg::info::RGMAException& e) {
            std::printf("Exception: %s\n", e.what());
        }
    }

Summary

  • R-GMA
      - is suitable for Information and Monitoring
      - is a relational implementation of the GGF’s GMA
      - has different Producer types
      - mediator creates the impression of a single RDBMS
      - has authentication using grid certificates
      - has been integrated with Ganglia
      - has an API available in multiple languages

Further Information

  • Information and Monitoring Services
      - http://hepunx.rl.ac.uk/edg/wp3/
  • R-GMA
      - http://www.r-gma.org/