Using Resource Description Framework (RDF) to carry metadata

Using Resource Description Framework (RDF) to carry metadata for datasets
M. Benno Blumenthal and John del Corral
International Research Institute for Climate and Society
OpenDAP 2007
http://iridl.ldeo.columbia.edu/ontologies/

RDF is important for OpenDAP because
• By embedding OpenDAP in an RDF document, metadata (a.k.a. attributes) not understood by OpenDAP code are easily carried in a semantically valid way
• Explicit relationships between OpenDAP variables can cleanly solve netcdf common name vs OpenDAP GRID/MAP structures, while avoiding retransmission of common independent variables
• Explicit mapping between the different data models of the different OpenDAP APIs

RDF is important for OpenDAP because
• Support for different languages can be built on top of RDF object support, e.g. Ruby ActiveRDF

Why RDF?
• Web-based system for interoperating semantics
• A key part of the Semantic Web
• RDF/OWL is an interesting technology, but it is even more interesting when it is clear that it can help solve our problems

[Diagram labels: Standard Metadata Schema/Data Services; Tools; Datasets; Users]

Many Data Communities
[Diagram labels, repeated once per community: Standard Metadata Schema; Tools; Datasets; Users]

Super Schema
[Diagram labels: Standard metadata schema; then, repeated per community: Standard Metadata Schema; Tools; Datasets; Users]

Super Schema: direct
[Diagram labels: Standard metadata schema/data service; then, repeated per community: Standard Metadata Schema; Tools; Datasets; Users]

Flaws
• A lot of work
• Super Schema/Service is the lowest common denominator
• Science keeps evolving, so that standards either fall behind or constantly change

RDF Standard Data Model Exchange
[Diagram labels: Standard metadata schema; RDF; then, repeated per community: RDF; Standard Metadata Schema; Tools; Datasets; Users]

RDF Data Model Exchange
[Diagram labels: Standard metadata schema; RDF; then, repeated per community: RDF; Standard Metadata Schema; Tools; Datasets; Users]

RDF Architecture
[Diagram labels: queries; Virtual (derived) RDF; RDF (repeated)]

Why is this better?
• Maps the original dataset metadata into a standard format that can be transported and manipulated
• Still the same impedance mismatch when mapped to the least-common-denominator standard metadata, but
• When a better standard comes along, the original complete-but-nonstandard metadata is already there to be remapped, and “late semantic binding” means everyone can use the new semantic mapping
• Can use enhanced mappings between models that have common concepts beyond the least common denominator
• EASIER – tools to enhance the mapping process, mappings build on other mappings

[Diagram labels: CF attributes; NC basic attributes; IRIDL attributes; SWEET Ontologies; CF Standard Names As Terms; SWEET as Terms; Gazetteer Terms; Search Terms; IRIDL Terms]

Sample Tool: Faceted Search
http://iridl.ldeo.columbia.edu/ontologies/query2.pl?...

Distinctive Features of the search
• Search terms are interrelated
• Terms that describe the set of returns are displayed (spanning and not)
• Returned items also have structure (subitems and superseded items are not shown)

Architectural Features of the search
http://iridl.ldeo.columbia.edu/ontologies/query2.pl
• Multiple search structures possible
• Multiple languages possible
• Search structure is kept in the database, not in the code

RDF: framework for writing connections
Triplets of
• Subject
• Property (or Predicate)
• Object
URIs identify things, i.e. most of the above
Namespaces are used as a convenient shorthand for the URIs
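
The triple and namespace ideas above can be sketched in a few lines of plain Python. This is only an illustration, not how any particular RDF library stores its data: the dc, rdf, and rdfs namespace URIs are the published ones, while the iridl namespace URI and the subject URI for the dataset are placeholders assumed for the example.

```python
# Prefix-to-URI table. The rdf, rdfs, and dc URIs are the published ones;
# the iridl URI is a placeholder assumed for illustration only.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "iridl": "http://example.org/iridl#",  # hypothetical
}


def expand(qname):
    """Expand a prefixed name such as 'dc:title' into a full URI."""
    prefix, _, local = qname.partition(":")
    return NAMESPACES[prefix] + local


# A triple is (subject, property, object). The subject URI is hypothetical.
WOA = "http://example.org/data/WOA01"
triple = (WOA, expand("dc:title"), "NOAA NODC WOA01")
print(triple)
```

The prefix is only a writing convenience: two documents that use different prefixes for the same namespace URI still produce identical triples once the names are expanded.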

Datatype Properties
{WOA} dc:title “NOAA NODC WOA01”
{WOA} dc:description “NOAA NODC WOA01: World Ocean Atlas 2001, an atlas of objectively analyzed fields of major ocean parameters at monthly, seasonal, and annual time scales. Resolution: 1x1; Longitude: global; Latitude: global; Depth: [0 m, 5500 m]; Time: [Jan, Dec]; monthly”

Object Properties
{WOA} iridl:isContainerOf {Grid-1x1},
{Grid-1x1} iridl:isContainerOf {Monthly}

[Figure: WOA01 diagram]

Standard Properties
{WOA} dcterm:hasPart {Grid-1x1},
{Grid-1x1} dcterm:hasPart {MONTHLY}
Alternatively
{WOA} iridl:isContainerOf {Grid-1x1},
{iridl:isContainerOf} rdfs:subPropertyOf {dcterm:hasPart}
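
The "Alternatively" form works because a consumer that knows the rdfs:subPropertyOf semantics can derive the standard dcterm:hasPart statements from the local iridl:isContainerOf ones. A minimal sketch of that derivation, assuming triples are plain Python tuples of prefixed names and that a single pass is enough (no chains of sub-properties):

```python
SUB_PROPERTY_OF = "rdfs:subPropertyOf"

# Asserted triples, written with prefixed names as on the slide.
triples = {
    ("WOA", "iridl:isContainerOf", "Grid-1x1"),
    ("Grid-1x1", "iridl:isContainerOf", "Monthly"),
    ("iridl:isContainerOf", SUB_PROPERTY_OF, "dcterm:hasPart"),
}


def infer_super_properties(triples):
    """Return the triples implied by rdfs:subPropertyOf (one pass, no chaining)."""
    parents = {}
    for s, p, o in triples:
        if p == SUB_PROPERTY_OF:
            parents.setdefault(s, set()).add(o)
    # Restate every triple under each parent of its property.
    return {(s, parent, o) for s, p, o in triples for parent in parents.get(p, ())}


derived = infer_super_properties(triples)
for t in sorted(derived):
    print(t)
```

The data provider only has to publish its own property plus one mapping statement; the standard view is generated rather than hand-written.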

netcdf/CF in RDF
{SST} rdf:type {cfatt:non_coordinate_variable},
{SST} cfatt:standard_name {cf:sea_surface_temperature},
{SST} netcdf:hasDimension {longitude}
Object properties provide a framework for explicitly writing down relationships between data objects/components, e.g. the vague meaning of nesting is made explicit
Properties also can be related, since they are objects too
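
As a rough sketch of how such statements could be produced from a variable's attributes, the function below turns a dictionary of attributes and a list of dimensions into triples. It is not the IRI implementation: the function name, the `ncatt:` prefix used for attributes the sketch does not recognize, and the `units` value are all assumptions made for illustration, and the function assumes it is only called for non-coordinate (data) variables.

```python
def variable_to_triples(name, attributes, dimensions):
    """Turn one data variable's attributes and dimensions into (s, p, o) triples."""
    # Assumption: callers pass only non-coordinate (data) variables.
    triples = [(name, "rdf:type", "cfatt:non_coordinate_variable")]
    for key, value in attributes.items():
        if key == "standard_name":
            # CF standard names become terms in the cf: namespace.
            triples.append((name, "cfatt:standard_name", "cf:" + value))
        else:
            # Unrecognized attributes are still carried, as plain literals.
            # The ncatt: prefix is hypothetical.
            triples.append((name, "ncatt:" + key, value))
    for dim in dimensions:
        triples.append((name, "netcdf:hasDimension", dim))
    return triples


sst = variable_to_triples(
    "SST",
    {"standard_name": "sea_surface_temperature", "units": "degree_Celsius"},
    ["longitude", "latitude"],
)
for t in sst:
    print(t)
```

The fallback branch reflects the point made earlier in the talk: attributes that the transport code does not understand are still carried along rather than dropped.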

RDF Tools
• Transport/Exchange (RDF/XML)
• Storage
• RDF APIs (Redland, Jena, Sesame)
• Query (SPARQL, SeRQL, …)
• Basic Semantics
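
Query languages such as SPARQL are built around matching triple patterns against the stored statements. The sketch below is not SPARQL or SeRQL syntax and does not use any of the libraries named above; it is a plain-Python illustration of the underlying pattern-matching idea, with `None` standing in for a query variable.

```python
def match(triples, subject=None, prop=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and prop in (None, p) and obj in (None, o)
    ]


# Statements taken from the WOA01 slides, written with prefixed names.
triples = [
    ("WOA", "dc:title", "NOAA NODC WOA01"),
    ("WOA", "iridl:isContainerOf", "Grid-1x1"),
    ("Grid-1x1", "iridl:isContainerOf", "Monthly"),
]

# "What does WOA contain?"
print(match(triples, subject="WOA", prop="iridl:isContainerOf"))
# "Which containment statements exist at all?"
print(match(triples, prop="iridl:isContainerOf"))
```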

Search Interface Term
• http://iri.columbia.edu/~benno/sampleterm.pdf

Ontologies
• Use conventions to connect concepts to established sets of concepts
• Generate additional “virtual” triples from the original set and semantics
• RDFS – some property/class semantics
• OWL – additional property/class semantics: more sophisticated (ontological) relationships
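
To make "virtual" triples concrete, the sketch below applies two standard RDFS entailment rules (rdfs:subClassOf is transitive, and an instance of a class is an instance of its superclasses) until no new statements appear. It is a naive fixed-point loop for illustration, not how a production reasoner works, and the `ex:` class names are hypothetical.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Add implied triples until nothing new appears (a fixed point)."""
    closed = set(triples)
    while True:
        new = set()
        for s, p, o in closed:
            if p != SUBCLASS:
                continue
            for s2, p2, o2 in closed:
                # subClassOf is transitive.
                if p2 == SUBCLASS and o2 == s:
                    new.add((s2, SUBCLASS, o))
                # Instances of a class are instances of its superclasses.
                if p2 == TYPE and o2 == s:
                    new.add((s2, TYPE, o))
        if new <= closed:
            return closed
        closed |= new


# The ex: classes are hypothetical, added only to give the rules something to do.
asserted = {
    ("SST", TYPE, "cfatt:non_coordinate_variable"),
    ("cfatt:non_coordinate_variable", SUBCLASS, "ex:variable"),
    ("ex:variable", SUBCLASS, "ex:data_object"),
}
virtual = rdfs_closure(asserted) - asserted
for t in sorted(virtual):
    print(t)
```

Only the asserted statements need to be stored or exchanged; the virtual ones can be regenerated whenever the semantics change.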

OWL
• Language for expressing ontologies, i.e. the semantics are very important
• However, even without a reasoner to generate the implied RDF statements, OWL classes and properties represent a sophistication of the RDF Schema
• However, there is a serious split in world view from what we have been talking about: concepts as classes vs concepts as individuals

Faceted Search Explicated

Search Interface
• Items (datasets/maps)
• Terms
• Facets
• Taxa

Search Interface Semantic API
{item} dc:title dc:description rss:link iridl:icon
dcterm:isPartOf {item2}
dcterm:isReplacedBy {item2}
{item} trm:isDescribedBy {term} a {facet} of {taxa} of {trm:Term},
{facet} a {trm:Facet},
{taxa} a {trm:Taxa},
{term} trm:directlyImplies {term2}
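
One way to read this API is that an item matches a search term if it is described by that term or by any term that implies it. The sketch below follows that reading; treating trm:directlyImplies as the basis for a transitive "implies" relation is an assumption of this example, and the item names and terms are invented for illustration.

```python
def implied_terms(term, directly_implies):
    """Return term plus every term reachable through trm:directlyImplies."""
    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for nxt in directly_implies.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def search(described_by, directly_implies, wanted):
    """Return items whose terms, directly or by implication, cover all wanted terms."""
    hits = []
    for item, terms in described_by.items():
        expanded = set()
        for term in terms:
            expanded |= implied_terms(term, directly_implies)
        if set(wanted) <= expanded:
            hits.append(item)
    return sorted(hits)


# Hypothetical terms and items, standing in for trm:directlyImplies and
# trm:isDescribedBy statements.
directly_implies = {
    "sea_surface_temperature": ["temperature"],
    "temperature": ["physical_quantity"],
}
described_by = {
    "item1": ["sea_surface_temperature", "monthly"],
    "item2": ["salinity", "monthly"],
}

print(search(described_by, directly_implies, ["temperature", "monthly"]))
print(search(described_by, directly_implies, ["monthly"]))
```

Because the term relationships live in the data rather than in the search code, changing the search structure means changing statements, not the program.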

Faceted Search w/Queries
http://iridl.ldeo.columbia.edu/ontologies/query2.pl?...

RDF Architecture
[Diagram labels: queries; Virtual (derived) RDF; RDF (repeated)]

IRI RDF Architecture
[Diagram labels: MMI; Data Servers; Ontologies; JPL; bibliography; Start Point; Standards Organizations; RDF Crawler; RDFS Semantics; OWL Semantics; SWRL Rules; SeRQL CONSTRUCT; Sesame; Search Queries; Search Interface; Location Canonicalizer; Time Canonicalizer]

[Diagram labels: CF attributes; NC basic attributes; IRIDL attributes; SWEET Ontologies; CF Standard Names As Terms; SWEET as Terms; Gazetteer Terms; Search Terms; IRIDL Terms]

RDF is important for OpenDAP because
• By embedding OpenDAP in an RDF document, metadata (a.k.a. attributes) not understood by OpenDAP code are easily carried in a semantically valid way
• Explicit relationships between OpenDAP variables can cleanly solve netcdf common name vs OpenDAP GRID/MAP structures, while avoiding retransmission of common independent variables
• Explicit mapping between the different data models of the different OpenDAP APIs
• Build on language support of RDF objects

Embedded OpenDAP Ontology

Topics/Issues
• OpenDAP and RDF: can we transport data semantics without fixing the entire schema?
• netcdf/HDF and RDF: do we need noncontextual modeling in our metadata transport/storage?
• Concepts as classes vs concepts as individuals
• Sub-classes vs sub-categories