EcoGrid in SEEK: A Data Grid System
- Slides: 26
EcoGrid in SEEK: A Data Grid System for Ecology
Bertram Ludaescher, University of California, Davis
Arcot Rajasekar, San Diego Supercomputer Center, University of California, San Diego
What is SEEK?
- Science Environment for Ecological Knowledge (SEEK)
- Multidisciplinary research project to create:
  - Distributed data network (EcoGrid) for environmental, ecological, and systematics data
  - Scalable systems for scientific analysis (workflow systems)
  - Systems for semi-automated data and model integration
- Collaborators: NCEAS, UNM, SDSC, U Kansas, Vermont, Napier, ASU, UNC
Science Environment for Ecological Knowledge: Research Objectives
- Access to ecological, environmental, and biodiversity data
  - Enable data sharing and re-use
  - Enhance data discovery at global scales
- Scalable analysis and synthesis
  - Taxonomic, spatial, temporal, and conceptual integration of data
  - Address data heterogeneity issues
  - Enable communication and collaboration for analysis
  - Enable re-use of analytical components
- Collaborators: NCEAS, UNM, SDSC, U Kansas, Vermont, Napier, ASU, UNC
SEEK Overview
SEEK Components
- EcoGrid: making diverse environmental data systems interoperate
- Kepler: modeling scientific workflows
- Semantic Mediation System: "smart" data discovery and integration
- Knowledge Representation WG
- Taxon WG
- BEAM WG
- Education, Outreach, Training
SEEK EcoGrid
- Goal: allow diverse environmental data systems to interoperate
  - Hides complexity of underlying systems using lightweight interfaces
  - Integrates diverse data networks from ecology, biodiversity, and environmental sciences
- Data systems
  - Any system can implement these interfaces
  - Prototyping using: Metacat, SRB, DiGIR, Xanthoria, etc.
- Supports multiple metadata standards
  - EML, Darwin Core as foci
EcoGrid client interactions
- Modes of interaction
  - Client-server
  - Fully distributed
  - Peer-to-peer
- EcoGrid Registry
  - Node discovery
  - Service discovery
- Aggregation services
  - Centralized access
  - Reliability
  - Data preservation
EcoGrid Focus
- Data and metadata
  - Distributed data
  - XML-based metadata
- Service to Semantic Mediation Layer
  - Access to ontologies and taxon services
  - Helping with semantic data integration
- Service to Analysis and Modelling Layer
  - Interaction with Kepler workflows
  - Interaction with grid computing facilities
- Access to legacy apps
  - LifeMapper
  - Spatial Data Workbench
EcoGrid Node
Layers in EcoGrid
Ecological Metadata Language
- Metadata: a means to manage ecological data
  - There is no universal data model for ecology
  - Accommodate heterogeneity and dispersion
- EML
  - Common language for archiving and transporting data
  - Discovery information: Creator, Title, Abstract, Keyword, etc.
  - Content
  - Context
  - Physical, logical structure
  - SEEK will add semantic structure
An Example EML Document
<?xml version="1.0"?>
<eml:eml packageId="pisco.UCSB.5.20" system="knb" xmlns:eml="eml://ecoinformatics.org/eml-2.0.0">
  <dataset>
    <shortName>Alegria Temperatures</shortName>
    <title>PISCO: Intertidal Temperature Data: Alegria, California: 1996-1997</title>
    <creator id="C.Blanchette">
      <individualName>
        <givenName>Carol</givenName>
        <surName>Blanchette</surName>
      </individualName>
      <organizationName>PISCO</organizationName>
      <address>
        <deliveryPoint>UCSB Marine Science Institute</deliveryPoint>
        <city>Santa Barbara</city>
        <administrativeArea>CA</administrativeArea>
        <postalCode>93106</postalCode>
      </address>
    </creator>
    <abstract>
      <para>These temperature data were collected at Alegria Beach, California, and were...</para>
    </abstract>
    <keywordSet>
      <keyword>OceanographicSensorData</keyword>
      <keyword>Thermistor</keyword>
      <keywordThesaurus>PISCOCategories</keywordThesaurus>
    </keywordSet>
    <intellectualRights>
      <para>Please contact the authors for permission to use these data. Please also acknowledge the authors in any publications.</para>
    </intellectualRights>
    <contact>
      <references>C.Blanchette</references>
    </contact>
  </dataset>
</eml:eml>
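As a rough illustration of how a client might pull discovery information out of a document like the example, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The embedded document is a trimmed copy of the example, and the helper name `discovery_fields` is hypothetical; it is not part of any SEEK or EML tooling.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example EML document (only a few discovery elements kept).
EML_DOC = """<?xml version="1.0"?>
<eml:eml packageId="pisco.UCSB.5.20" system="knb"
         xmlns:eml="eml://ecoinformatics.org/eml-2.0.0">
  <dataset>
    <title>PISCO: Intertidal Temperature Data: Alegria, California: 1996-1997</title>
    <creator id="C.Blanchette">
      <individualName>
        <givenName>Carol</givenName>
        <surName>Blanchette</surName>
      </individualName>
    </creator>
    <keywordSet>
      <keyword>OceanographicSensorData</keyword>
      <keyword>Thermistor</keyword>
    </keywordSet>
  </dataset>
</eml:eml>"""


def discovery_fields(eml_text):
    """Return package id, title, creator surnames, and keywords (hypothetical helper)."""
    root = ET.fromstring(eml_text)
    # Only the root element carries the eml: prefix; its children are unqualified.
    dataset = root.find("dataset")
    return {
        "packageId": root.get("packageId"),
        "title": dataset.findtext("title"),
        "creators": [e.text for e in dataset.iter("surName")],
        "keywords": [e.text for e in dataset.iter("keyword")],
    }


fields = discovery_fields(EML_DOC)
print(fields["packageId"], fields["creators"], fields["keywords"])
```

A search index built from such fields is one plausible way to support discovery by creator, title, or keyword.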
Metadata-driven data ingestion
- Key information needed to read and machine-process a data file is in the metadata
  - File descriptors (CSV, Excel, RDBMS, etc.)
  - Entity (table) and attribute (column) descriptions
    - Name
    - Type (integer, float, string, etc.)
    - Codes (missing values, nulls, etc.)
    - Integrity constraints
- In the future, this will include semantic typing
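The idea can be sketched in a few lines of Python: attribute descriptions (name, type, missing-value codes) drive how a delimited file is parsed. The metadata layout, the `ingest` function, and the sample rows below are all invented for illustration and do not reflect the actual EML attribute schema or SEEK code.

```python
import csv
import io

# Hypothetical attribute-level metadata: name, type, and missing-value codes.
ATTRIBUTES = [
    {"name": "site", "type": "string", "missing": [""]},
    {"name": "count", "type": "integer", "missing": ["-999"]},
    {"name": "temp_c", "type": "float", "missing": ["NA", "-999"]},
]

CASTS = {"string": str, "integer": int, "float": float}


def ingest(text, attributes, delimiter=","):
    """Parse delimited text into typed rows, mapping missing-value codes to None."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    expected = [a["name"] for a in attributes]
    if header != expected:  # a simple integrity check against the metadata
        raise ValueError(f"header {header} does not match metadata {expected}")
    rows = []
    for raw in reader:
        row = {}
        for attr, value in zip(attributes, raw):
            if value in attr["missing"]:
                row[attr["name"]] = None
            else:
                row[attr["name"]] = CASTS[attr["type"]](value)
        rows.append(row)
    return rows


# Invented sample data, not taken from any real dataset.
DATA = "site,count,temp_c\nAlegria,12,14.5\nAlegria,-999,NA\n"
rows = ingest(DATA, ATTRIBUTES)
print(rows)
```

Because the parser is configured entirely from the attribute list, the same function can read a differently structured file once its metadata is supplied.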
Heterogeneous data integration
- Requires advanced metadata and processing
  - Attributes must be semantically typed
  - Collection protocols must be known
  - Units and measurement scale must be known
  - Measurement relationships must be known, e.g., that ArealDensity = Count/Area
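The relationship ArealDensity = Count/Area shows why units must be known before two datasets can be combined. The following minimal sketch normalizes area to square metres before dividing; the `Observation` class, the unit table, and the numbers are assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical conversion factors from supported area units to square metres.
AREA_TO_M2 = {"m2": 1.0, "ha": 10_000.0, "km2": 1_000_000.0}


@dataclass
class Observation:
    count: int       # number of individuals counted
    area: float      # sampled area, expressed in `area_unit`
    area_unit: str   # one of the keys of AREA_TO_M2


def areal_density(obs):
    """ArealDensity = Count / Area, returned in individuals per square metre."""
    area_m2 = obs.area * AREA_TO_M2[obs.area_unit]
    return obs.count / area_m2


# Two invented observations recorded in different units.
a = Observation(count=50, area=100.0, area_unit="m2")
b = Observation(count=5000, area=1.0, area_unit="ha")
print(areal_density(a), areal_density(b))
```

Once the units are reconciled, the two observations turn out to describe the same density, which would not be apparent from the raw counts and areas alone.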
Ecological ontologies
- What was measured (e.g., biomass)
- Type of measurement (e.g., Energy)
- Context of measurement (e.g., Psychotria limonensis)
- How it was measured (e.g., dry weight)
- SEEK intends to enable community-created ecological ontologies using OWL
  - Represents a controlled vocabulary for ecological metadata
- More about this in Bertram's talk
EcoGrid Resources
- LTER sites shown on the map: NTL, AND, HBR, VCR, LUQ
- Node types: Metacat node, VegBank node, Xanthoria node, SRB node, DiGIR node, legacy system
- Networks:
  - LTER Network (24)
  - Natural History Collections (>> 100)
  - Organization of Biological Field Stations (180)
  - UC Natural Reserve System (36)
  - Partnership for Interdisciplinary Studies of Coastal Oceans (4)
  - Multi-agency Rocky Intertidal Network (60)
EcoGrid Resources
- Diagram labels: EcoGrid Registry; EcoGrid interfaces to DiGIR, Metacat, SRB, Xanthoria, and VegBank
EcoGrid Node
EcoGrid Query Service
EcoGrid Query adopts a query schema, the Query Document Schema, as a common query language within EcoGrid.
<?xml version="1.0" encoding="UTF-8"?>
<egq:query queryId="test.1.1" system="http://knb.ecoinformatics.org"
    xmlns:egq="ecogrid://ecoinformatics.org/ecogrid-query-1.0.0beta1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="ecogrid://ecoinformatics.org/ecogrid-query-1.0.0beta1 ../src/xsd/query.xsd">
  <namespace prefix="eml">eml://ecoinformatics.org/eml-2.0.0</namespace>
  <returnfield>size</returnfield>
  <returnfield>owner</returnfield>
  <returnfield>min.value</returnfield>
  <returnfield>max.value</returnfield>
  <!-- <returnfield>value units</returnfield> -->
  <title>metadata query for Eco Models</title>
  <AND>
    <condition operator="LIKE" concept="mdasCollectionName">/home/whywhere.seek</condition>
    <condition operator="LIKE" concept="ORIGINAL DATA SET">%World Geodetic System%</condition>
    <condition operator="EQUALS" concept="max.value">39.11</condition>
  </AND>
</egq:query>
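To make the structure of such a query document concrete, here is a minimal sketch that evaluates a simplified query against in-memory metadata records. It assumes that `LIKE` treats `%` as a SQL-style wildcard and that `EQUALS` compares string values; neither is confirmed by the slides. The functions `like_to_regex`, `matches`, and `run_query` and the sample records are hypothetical, not part of the EcoGrid implementation.

```python
import re
import xml.etree.ElementTree as ET

# Simplified query document modelled on the example (fewer fields and conditions).
QUERY = """<egq:query queryId="test.1.1"
           xmlns:egq="ecogrid://ecoinformatics.org/ecogrid-query-1.0.0beta1">
  <returnfield>size</returnfield>
  <returnfield>owner</returnfield>
  <AND>
    <condition operator="LIKE" concept="ORIGINAL DATA SET">%World Geodetic System%</condition>
    <condition operator="EQUALS" concept="max.value">39.11</condition>
  </AND>
</egq:query>"""


def like_to_regex(pattern):
    """Translate a pattern with % wildcards into an anchored regular expression."""
    parts = [re.escape(p) for p in pattern.split("%")]
    return re.compile("^" + ".*".join(parts) + "$", re.DOTALL)


def matches(condition, record):
    """Evaluate one <condition> element against a dict of metadata values."""
    value = record.get(condition.get("concept"))
    if value is None:
        return False
    operator, target = condition.get("operator"), condition.text.strip()
    if operator == "LIKE":
        return like_to_regex(target).match(str(value)) is not None
    if operator == "EQUALS":
        return str(value) == target
    raise ValueError(f"unsupported operator: {operator}")


def run_query(query_text, records):
    """Return the requested return fields for records satisfying the AND group."""
    root = ET.fromstring(query_text)
    fields = [f.text for f in root.findall("returnfield")]
    conditions = root.find("AND").findall("condition")
    return [
        {name: rec.get(name) for name in fields}
        for rec in records
        if all(matches(c, rec) for c in conditions)
    ]


RECORDS = [  # invented sample metadata
    {"ORIGINAL DATA SET": "World Geodetic System 1984", "max.value": "39.11",
     "size": 2048, "owner": "seek"},
    {"ORIGINAL DATA SET": "Local survey grid", "max.value": "39.11",
     "size": 512, "owner": "seek"},
]
result = run_query(QUERY, RECORDS)
print(result)
```

In a real deployment each wrapper would presumably translate the query document into the native query language of its backend rather than filter records in memory.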
EcoGrid Services implementation for GET/PUT
- The 'get' call from an EcoGrid client retrieves the content of a dataset or file from a resource such as SRB or Metacat.
- The 'get' function also enables SQL querying of relational databases (Oracle, DB2, etc.) that are preregistered as data sources in SRB.
- Put for data: the EcoGrid put service allows users to create (upload) files in EcoGrid resources such as Metacat and SRB.
- Put for metadata: the EcoGrid put service also allows ingestion of metadata, such as EML in Metacat or user-defined metadata in SRB.
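One way to picture the get/put services is as a small interface that every wrapped data system implements. The sketch below is an assumption about shape only: the class names `EcoGridNode` and `InMemoryNode`, the method signatures, and the sample content are hypothetical and are not taken from the EcoGrid WSDL.

```python
from abc import ABC, abstractmethod


class EcoGridNode(ABC):
    """Hypothetical lightweight interface a wrapped data system would implement."""

    @abstractmethod
    def get(self, identifier: str) -> bytes:
        """Return the content stored under `identifier`."""

    @abstractmethod
    def put(self, identifier: str, content: bytes) -> None:
        """Create (upload) a new data object."""

    @abstractmethod
    def put_metadata(self, identifier: str, metadata: dict) -> None:
        """Attach metadata (for example, parsed EML fields) to an object."""


class InMemoryNode(EcoGridNode):
    """Stand-in for a wrapper around a real store such as Metacat or SRB."""

    def __init__(self):
        self._data = {}
        self._metadata = {}

    def get(self, identifier):
        return self._data[identifier]

    def put(self, identifier, content):
        if identifier in self._data:
            raise KeyError(f"{identifier} already exists")
        self._data[identifier] = content

    def put_metadata(self, identifier, metadata):
        if identifier not in self._data:
            raise KeyError(f"{identifier} has no data object")
        self._metadata[identifier] = dict(metadata)

    def get_metadata(self, identifier):
        return self._metadata[identifier]


node = InMemoryNode()
# Invented sample content, used only to exercise the interface.
node.put("pisco.UCSB.5.20", b"date,temp_c\n1996-01-01,14.5\n")
node.put_metadata("pisco.UCSB.5.20", {"title": "Alegria Temperatures"})
print(node.get("pisco.UCSB.5.20"))
```

Keeping the interface this narrow is what would let very different backends sit behind the same client calls.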
EcoGrid Queries in Kepler
EML Metadata Display in Kepler
EcoGrid Sources in Kepler
Query Builder
Status
- Read, Query & Register completed
- Simple Registry operational
- EcoGrid wrappers completed for:
  - Metacat
  - SRB
  - DiGIR
  - Xanthoria
- Available interfaces
  - WSDL
  - Simple web interactivity
  - Kepler
Acknowledgements
This material is based upon work supported by:
- The National Science Foundation under Grant Numbers 9980154, 9904777, 0131178, 9905838, 0129792, and 0225676.
- The National Center for Ecological Analysis and Synthesis, a Center funded by NSF (Grant Number 0072909), the University of California, and the UC Santa Barbara campus.
- The Andrew W. Mellon Foundation.
PBI Collaborators: NCEAS, University of New Mexico (Long Term Ecological Research Network Office), San Diego Supercomputer Center, University of Kansas (Center for Biodiversity Research)
Kepler contributors: SEEK, Ptolemy II, SDM/SciDAC, GEON