Developing Ontologies (and more)
Peter Fox (NCAR)
ESIP Winter Meeting (TIWG), January 9, 2008, Washington, D.C.

Ontology Spectrum
(figure labels, from least to most expressive)
• Catalog/ID
• Terms/glossary
• Thesauri “narrower term” relation
• Informal is-a
• Formal is-a
• Formal instance
• Frames (properties)
• Value Restrs.
• Selected Logical Constraints (disjointness, inverse, …)
• General Logical constraints
Originally from AAAI 1999 - Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; updated by McGuinness. Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html

Ontology - declarative knowledge
• The triple: {subject-predicate-object}
  – Interferometer is-a optical instrument
  – Fabry-Perot is-a interferometer
  – Optical instrument has focal length
  – Optical instrument is-a instrument
  – Instrument has instrument operating mode
  – Data archive has measured parameter
  – SO2 concentration is-a concentration
  – Concentration is-a parameter
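The triples on this slide can be sketched in code. The following is a minimal illustration in plain Python, not an RDF library: the tuple layout and the helper names `ancestors` and `inherited_properties` are assumptions made for this example. It shows how is-a links chain together and how a "has" statement on a general class applies to its more specific classes.

```python
# Triples from the slide, stored as plain (subject, predicate, object) tuples.
TRIPLES = [
    ("interferometer", "is-a", "optical instrument"),
    ("Fabry-Perot", "is-a", "interferometer"),
    ("optical instrument", "has", "focal length"),
    ("optical instrument", "is-a", "instrument"),
    ("instrument", "has", "instrument operating mode"),
    ("data archive", "has", "measured parameter"),
    ("SO2 concentration", "is-a", "concentration"),
    ("concentration", "is-a", "parameter"),
]


def ancestors(term, triples=TRIPLES):
    """Return every class reachable from `term` by following is-a links."""
    found = []
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "is-a" and o not in found:
                found.append(o)
                frontier.append(o)
    return found


def inherited_properties(term, triples=TRIPLES):
    """Collect 'has' objects declared on `term` or on any is-a ancestor."""
    classes = [term] + ancestors(term, triples)
    return sorted({o for s, p, o in triples if p == "has" and s in classes})


print(ancestors("Fabry-Perot"))
print(inherited_properties("Fabry-Perot"))
```

A real ontology would store these statements in RDF or OWL and leave the chaining to a reasoner; the loop above only stands in for that step.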

Semantic Web Layers
(figure) http://www.w3.org/2003/Talks/1023-iswc-tbl/slide26-0.html, http://flickr.com/photos/pshab/291147522/

Terminology
• Ontology (n.d.). The Free On-line Dictionary of Computing. http://dictionary.reference.com/browse/ontology
  – An explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them.
• Semantic Web
  – An extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation, www.semanticweb.org
  – Primer: http://www.ics.forth.gr/isl/swprimer/
• Languages
  – OWL 1.0 (Lite, DL, Full) - Web Ontology Language (W3C)
  – RDF - Resource Description Framework (W3C)
  – OWL-S/SWSL - Web Services (W3C)
  – WSMO/WSML - Web Services (EC/W3C)
  – SWRL - Semantic Web Rule Language, RIF - Rule Interchange Format
• Editors: Protégé, SWOOP, CoE, VOM, Medius, SWeDE, …

OWL and RDF
• OWL
  – Lite
  – DL
  – Full
• RDF
• Services
  – OWL-S
  – SWSL
  – WSML
  – SAWSDL (WSDL-S)
• Rules
  – SWRL

Developing Ontologies
• Approach:
  – Bottom-up
  – Top-down (upper-level or foundational)
  – Mid-level (use case)
• Using tools
• Coding and testing
• Iterating
• Maintaining and evolving (curation, preservation)

GRDDL - bottom up
• GRDDL - Gleaning Resource Descriptions from Dialects of Languages
• Pretty much = “XML/XHTML (e.g.) into RDF via XSLT”
• Good support, e.g. Jena
• Handles microformats
• Active community
• How to categorize, use, re-use (parts of)?
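The "XML/XHTML into RDF" idea can be sketched as follows. Real GRDDL applies an XSLT transformation that the source document points to. Python's standard library has no XSLT engine, so this sketch does the equivalent walk by hand with `xml.etree.ElementTree`. The markup, the class names and the `glean` function are all hypothetical, chosen only to show the shape of the transformation.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment using class attributes as lightweight markup.
XHTML = """<html><body>
  <div class="instrument" id="fpi">
    <span class="name">Fabry-Perot</span>
    <span class="type">interferometer</span>
  </div>
</body></html>"""


def glean(xhtml):
    """Turn each <div id=...> and its <span class=...> children into triples."""
    root = ET.fromstring(xhtml)
    triples = []
    for div in root.iter("div"):
        subject = "#" + div.get("id")
        for span in div.findall("span"):
            # The class attribute plays the role of the predicate.
            triples.append((subject, span.get("class"), span.text))
    return triples


print(glean(XHTML))
```

The hard-coded mapping from class attribute to predicate is the part an XSLT stylesheet would normally carry, which is also where the slide's question about categorizing and re-using such transformations comes in.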

Collecting
• RDFa extends XHTML by:
  – extending the link and meta elements to include child elements
  – adding metadata to any element (a bit like the class attribute in microformats, but via dedicated properties)
  – It is very similar to microformats, but with more rigor:
    • it is a general framework (instead of an “agreement” on the meaning of, say, a class attribute value)
    • terminologies can be mixed more easily
• ATOM (used with RSS)
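To make the "dedicated properties" point concrete, here is a rough sketch that reads `about`, `property` and `content` attributes out of a fragment using Python's `html.parser`. It is not a conforming RDFa processor (no prefix resolution, no `rel`/`resource` handling), and the sample markup and the `extract` function are assumptions made for illustration.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (subject, property, value) triples from RDFa-style attributes."""

    def __init__(self):
        super().__init__()
        self.subject = None   # value of the most recent `about` attribute
        self.pending = None   # property waiting for its text content
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        if "property" in attrs:
            if "content" in attrs:
                # Value given directly in the attribute (typical for <meta>).
                self.triples.append((self.subject, attrs["property"], attrs["content"]))
            else:
                self.pending = attrs["property"]

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None


def extract(html):
    parser = PropertyCollector()
    parser.feed(html)
    parser.close()
    return parser.triples


SAMPLE = (
    '<div about="#talk">'
    '<span property="dc:title">Developing Ontologies</span>'
    '<meta property="dc:creator" content="Peter Fox" />'
    "</div>"
)
print(extract(SAMPLE))
```

Because the predicate is named in its own attribute rather than implied by a class value, terms from different vocabularies can sit side by side in the same fragment.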

Foundational Ontologies
CONTENTS
§ General concepts and relations that apply in all domains: physical object, process, event, …, inheres, participates, …
§ Rigorously defined: formal logic, philosophical principles, highly structured
§ Examples: DOLCE, BFO, GFO, SUMO, CYC, (Sowa)
Courtesy: Boyan Brodaric

Foundational Ontologies
PURPOSE: help integrate domain ontologies
“…and then there was one…”
(figure labels) Foundational ontology; Geology ontology; Struc; Rock ontology; Geophysics ontology; Marine ontology; Water ontology; Planetary ontology
Courtesy: Boyan Brodaric

Foundational Ontologies
PURPOSE: help organize domain ontologies
“…a place for everything, and everything in its place…”
(figure labels) Foundational ontology; shale; rock; formation; lithification
Courtesy: Boyan Brodaric

Problem scenario
§ Little work done on linking foundational ontologies with geoscience ontologies
§ Such linkage might benefit various scenarios requiring cross-disciplinary knowledge, e.g.:
  – water budgets: groundwater (geology) and surface water (hydro)
  – hazards risk: hazard potential (geology, geophysics) and items at threat (infrastructure, people, environment, economic)
  – health: toxic substances (geochemistry) and people, wildlife
  – many others…
Courtesy: Boyan Brodaric

DOLCE
(figure not reproduced)

DOLCE + SWEET
(table columns: DOLCE; = SWEET; < SWEET)
• Physical-body: BodyofGround, BodyofWater, …
• Material-Artifact: Infrastructure, Dam, Product, …
• Physical-Object: LivingThing, MarineAnimal
• Amount-of-Matter; Substance; HumanActivity; Physical-Phenomenon; Phenomena; Process; StateOfMatter
• Quality: Quantity, Moisture, …
• Physical-Region: Basalt, …
• Temporal-Region: Ordovician, …
§ Benefits: full coverage, rich relations, home for orphans, single superclasses
§ Issues: individuals (e.g. Planet Earth), roles (contaminant), features (SeaFloor)
Courtesy: Boyan Brodaric

Conclusions
§ Surprisingly good fit amongst ontologies so far: no show-stopper conflicts, a few difficult conflicts
§ DOLCE richness benefits geoscience ontologies: good conceptual foundation helps clear some existing problems
§ Unresolved issues in modeling science entities: modeling classifications, interpretations, theories, models, …
§ Same procedure with GeoSciML
Courtesy: Boyan Brodaric

SUMO - Standard Upper Merged Ontology
• Physical
  • Object
    • SelfConnectedObject
      • ContinuousObject
      • CorpuscularObject
    • Collection
  • Process
• Abstract
  • SetClass
  • Relation
  • Proposition
  • Quantity
    • Number
    • PhysicalQuantity
  • Attribute


Using SNAP/SPAN
(figure not reproduced)

GeoSciOnt?
(figure not reproduced)

Using SWEET
• Plug-in (import) domain detailed modules
• Lots of classes, few relations (properties)

Mix-n-Match
• The IRI example:
  – Collect a lot of different ontologies representing different terms, levels of concepts, etc. into a base form: RDF
  – See Benno’s talk in session 1b.
• MMI
• Others

(figure labels) NC basic attributes; CF attributes; IRIDL attributes/objects; CF data objects; SWEET Ontologies (OWL); CF Standard Names (RDF object); Location; CF Standard Names As Terms; IRIDL Terms; SWEET as Terms; Gazetteer Terms; Search Terms
Blumenthal

IRI RDF Architecture
(figure labels) MMI; Data Servers; Ontologies; JPL; bibliography; Start Point; Standards Organizations; RDF Crawler; RDFS Semantics; Owl Semantics; SWRL Rules; SeRQL CONSTRUCT; Sesame; Location Canonicalizer; Time Canonicalizer; Search Queries; Search Interface
Blumenthal

Mid-Level: Developing ontologies
• Use cases and a small team (7-8 people: 2-3 domain experts, 2 knowledge experts, 1 software engineer, 1 facilitator, 1 scribe)
• Identify classes and properties (leverage controlled vocab.)
  – Start with narrower terms, generalize when needed or possible
  – Adopt a suitable conceptual decomposition (e.g. SWEET)
  – Import modules when concepts are orthogonal
• Review, vet, publish
• Only code them (in RDF or OWL) when needed (CMAP, …)
• Ontologies: small and modular

Use Case example
• Plot the neutral temperature from the Millstone-Hill Fabry Perot, operating in the vertical mode during January 2000 as a time series.
• Objects:
  – Neutral temperature is a (temperature is a) parameter
  – Millstone Hill is a (ground-based observatory is a) observatory
  – Fabry-Perot is a interferometer is a optical instrument is a instrument
  – Vertical mode is a instrument operating mode
  – January 2000 is a date-time range
  – Time is a independent variable/coordinate
  – Time series is a data plot is a data product

Class and property example
• Parameter
  – Has coordinates (independent variables)
• Observatory
  – Operates instruments
• Instrument
  – Has operating mode
• Instrument operating mode
  – Has measured parameters
• Date-time interval
• Data product
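The classes and properties listed here can be written down directly as data structures. This is a minimal sketch with Python dataclasses, populated with the objects from the Millstone Hill use case. The field names and the `parameters_measured` helper are illustrative assumptions, not part of any published ontology.

```python
from dataclasses import dataclass, field


@dataclass
class Parameter:
    name: str
    coordinates: list = field(default_factory=list)  # independent variables


@dataclass
class OperatingMode:
    name: str
    measured_parameters: list = field(default_factory=list)


@dataclass
class Instrument:
    name: str
    operating_modes: list = field(default_factory=list)


@dataclass
class Observatory:
    name: str
    instruments: list = field(default_factory=list)


# Individuals taken from the use case text.
neutral_temperature = Parameter("neutral temperature", coordinates=["time"])
vertical = OperatingMode("vertical", [neutral_temperature])
fabry_perot = Instrument("Fabry-Perot", [vertical])
millstone_hill = Observatory("Millstone Hill", [fabry_perot])


def parameters_measured(observatory, instrument_name, mode_name):
    """Follow operates -> has operating mode -> has measured parameters."""
    return [
        p.name
        for inst in observatory.instruments
        if inst.name == instrument_name
        for mode in inst.operating_modes
        if mode.name == mode_name
        for p in mode.measured_parameters
    ]


print(parameters_measured(millstone_hill, "Fabry-Perot", "vertical"))
```

Walking the property chain from observatory to parameter is the same traversal a query over the ontology would perform to answer the plotting use case.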


Higher level use case
• Find data which represents the state of the neutral atmosphere above 100 km, toward the arctic circle at any time of high geomagnetic activity

Translating the Use-Case - nonmonotonic?
Input
• Physical properties: State of neutral atmosphere
• Spatial:
  – Above 100 km
  – Toward arctic circle (above 45 N)
• Conditions:
  – High geomagnetic activity
• Action: Return Data
Ontology statements
• GeoMagneticActivity has ProxyRepresentation
• GeophysicalIndex is a ProxyRepresentation (in Realm of Neutral Atmosphere)
• Kp is a GeophysicalIndex
  – hasTemporalDomain: “daily”
  – hasHighThreshold: xsd_number = 8
• Date/time when Kp >= 8
Specification needed for query to CEDARWEB
• Instrument
• Parameter(s)
• Operating Mode
• Observatory
• Date/time
• Return-type: data

Translating the Use-Case ctd.
Input
• Physical properties: State of neutral atmosphere
• Spatial:
  – Above 100 km
  – Toward arctic circle (above 45 N)
• Conditions:
  – High geomagnetic activity
• Action: Return Data
Ontology statements
• NeutralAtmosphere is a subRealm of TerrestrialAtmosphere
  – hasPhysicalProperties: NeutralTemperature, Neutral Wind, etc.
  – hasSpatialDomain: [0, 360], [0, 180], [100, 150]
  – hasTemporalDomain:
• NeutralTemperature is a Temperature (which) is a Parameter
• GeoMagneticActivity has ProxyRepresentation
• GeophysicalIndex is a ProxyRepresentation (in Realm of Neutral Atmosphere)
• Kp is a GeophysicalIndex
  – hasTemporalDomain: “daily”
  – hasHighThreshold: xsd_number = 8
• Date/time when Kp >= 8
• FabryPerotInterferometer is a Interferometer, (which) is a OpticalInstrument (which) is a Instrument
  – hasFilterCentralWavelength: Wavelength
  – hasLowerBoundFormationHeight: Height
• ArcticCircle is a GeographicRegion
  – hasLatitudeBoundary:
  – hasLatitudeUpperBoundary:
Specification needed for query to CEDARWEB
• Instrument
• Parameter(s)
• Operating Mode
• Observatory
• Date/time
• Return-type: data
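The translation step can be sketched in code: the vague condition "high geomagnetic activity" becomes "dates when Kp is at or above the high threshold", and the result fills in the Date/time slot of the query specification. The threshold (8), altitude (100 km) and latitude (45 N) come from the slide. The dictionary layout, the function names and the sample Kp values are made up for illustration and do not reflect the actual CEDARWEB interface.

```python
from datetime import date

KP_HIGH_THRESHOLD = 8    # hasHighThreshold from the slide
MIN_ALTITUDE_KM = 100    # "above 100 km"
MIN_LATITUDE_DEG = 45    # "toward arctic circle (above 45 N)"


def high_activity_dates(daily_kp):
    """Return the dates whose daily Kp value meets the high threshold."""
    return sorted(d for d, kp in daily_kp.items() if kp >= KP_HIGH_THRESHOLD)


def build_query(daily_kp):
    """Assemble a hypothetical query specification from the use case."""
    return {
        "parameters": ["NeutralTemperature", "NeutralWind"],
        "min_altitude_km": MIN_ALTITUDE_KM,
        "min_latitude_deg": MIN_LATITUDE_DEG,
        "dates": high_activity_dates(daily_kp),
        "return_type": "data",
    }


# Invented Kp values, for demonstration only.
sample_kp = {
    date(2000, 1, 1): 3,
    date(2000, 1, 2): 8,
    date(2000, 1, 3): 9,
}
print(build_query(sample_kp))
```

Instrument, operating mode and observatory are left out of the sketch because the use case does not name them; in the slide they are resolved through the ontology (for example, which instruments measure NeutralTemperature).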

Tools - Using Protégé
(figure not reproduced)

Creating Ontologies - visual
• UML - new release of ODM/MOF
  – Ontology Definition Metamodel/Meta Object Facility (OMG) for UML
  – Provides standardized notation
• CMAP Ontology Editor (concept mapping tool from IHMC)
  – Drag/drop visual development of classes, subclass (is-a) and property relationships
  – Reads and writes OWL
  – Formal convention (OWL/RDF tags, etc.)
• White board, text file

Using CMAP/COE
(figure not reproduced)

Is OWL the only option? No…
• SKOS - Simple Knowledge Organization System
• Annotations (RDFa)
• Atom
• Natural Language (read results from a web search and transform to a usable form)
  – CL (common logic)
  – Rabbit, e.g. ShellfishCourse is a Meal Course that (if has drink) always has drink Potable Liquid that has Full body and which either has Moderate or Strong flavour
  – PENG (processable English)

Is OWL the only option II? No…
• Natural Language (NL)
  – Read results from a web search and transform to a usable form
  – Find/filter out inconsistencies, concepts/relations that cannot be represented
• Popular options
  – CLCE (common logic controlled English)
  – Rabbit, e.g. ShellfishCourse is a Meal Course that (if has drink) always has drink Potable Liquid that has Full body and which either has Moderate or Strong flavour
  – PENG (processable English)
• Really need PSCI - process-able science

Creating Ontologies - verbal
• Translating use cases
• E.g. Find data which represents the state of the neutral atmosphere above 100 km, toward the arctic circle at any time of high geomagnetic activity
• Can this be expressed as an ontology?
  – CLCE, Rabbit, PENG, Sydney syntax
• Notice something about the next examples?

Sydney syntax
• If X has Y as a father then Y is the only father of X.
• The class person is equivalent to male or female, and male and female are mutually exclusive.
  equivalent to
  The classes male and female are mutually exclusive. The class person is fully defined as anything that is a male or a female.
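The Sydney syntax sentences read as constraints that a set of individuals either satisfies or violates. The sketch below checks them over plain Python sets: "only father" is treated as a functional property, and the person/male/female sentence as a disjoint union. The function names and sample individuals are hypothetical, and a real OWL reasoner would infer consequences rather than only validate data as done here.

```python
def fathers_are_unique(father_pairs):
    """'If X has Y as a father then Y is the only father of X.'

    `father_pairs` is an iterable of (child, father) statements.
    """
    seen = {}
    for child, father in father_pairs:
        # setdefault records the first father seen for each child.
        if seen.setdefault(child, father) != father:
            return False
    return True


def person_axioms_hold(males, females, persons):
    """'male and female are mutually exclusive' and
    'person is fully defined as anything that is a male or a female'."""
    disjoint = not (males & females)
    equivalent = persons == (males | females)
    return disjoint and equivalent


print(fathers_are_unique([("Ann", "Bob"), ("Cal", "Bob")]))
print(person_axioms_hold({"Bob"}, {"Ann"}, {"Ann", "Bob"}))
```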

PENG - Processable English
1. If X is a research programmer then X is a programmer.
2. Bill Smith is a research programmer who works at the CLT.
3. Who is a programmer and works at the CLT?
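The three PENG sentences correspond to a rule, a pair of facts, and a query. The following is a small sketch of how a system might process them once parsed, with the parsing step assumed to have already happened. The triple encoding and the `infer` and `who` helpers are illustrative, not PENG's actual internals.

```python
# Sentence 1: "If X is a research programmer then X is a programmer."
RULES = {"research programmer": "programmer"}

# Sentence 2: "Bill Smith is a research programmer who works at the CLT."
FACTS = {
    ("Bill Smith", "is-a", "research programmer"),
    ("Bill Smith", "works-at", "CLT"),
}


def infer(facts, rules):
    """Apply the subclass rules repeatedly until no new fact is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(derived):
            if p == "is-a" and o in rules:
                new_fact = (s, "is-a", rules[o])
                if new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


def who(cls, place, facts):
    """Sentence 3: 'Who is a <cls> and works at the <place>?'"""
    return sorted(
        s for s, p, o in facts
        if p == "is-a" and o == cls and (s, "works-at", place) in facts
    )


print(who("programmer", "CLT", infer(FACTS, RULES)))
```

Without the inference step the query finds nobody, since "programmer" is never stated directly about Bill Smith; that is the work the rule in sentence 1 does.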

CLCE - Common Logic Controlled English
CLCE: If a set x is the set of (a cat, a dog, and an elephant), then the cat is an element of x, the dog is an element of x, and the elephant is an element of x.
PC: ~(∃x:Set)(∃x1:Cat)(∃x2:Dog)(∃x3:Elephant)(Set(x, x1, x2, x3) ∧ ~(x1∈x ∧ x2∈x ∧ x3∈x))

Use Case
• Provide a decision support capability for an analyst to determine an individual’s susceptibility to avian flu without having to be precise in terminology (-nyms)


Using ThManager
(figure not reproduced)

Services
• Ontologies of services, provides:
  – What does the service provide for prospective clients? The answer to this question is given in the "profile," which is used to advertise the service. To capture this perspective, each instance of the class Service presents a ServiceProfile.
  – How is it used? The answer to this question is given in the "process model." This perspective is captured by the ServiceModel class. Instances of the class Service use the property describedBy to refer to the service's ServiceModel.
  – How does one interact with it? The answer to this question is given in the "grounding." A grounding provides the needed details about transport protocols. Instances of the class Service have a supports property referring to a ServiceGrounding.
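The three questions map onto three properties of a Service. This sketch mirrors that structure with Python dataclasses. The class and property names follow the slide (ServiceProfile, ServiceModel, ServiceGrounding, presents, describedBy, supports), while the fields inside each class, the example service and the `answers` helper are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServiceProfile:
    description: str   # what the service offers; used to advertise it


@dataclass
class ServiceModel:
    steps: list        # how the service is used (the "process model")


@dataclass
class ServiceGrounding:
    protocol: str      # how to interact with it (transport details)


@dataclass
class Service:
    presents: ServiceProfile
    described_by: ServiceModel
    supports: ServiceGrounding


# Hypothetical service instance, for demonstration only.
plot_service = Service(
    presents=ServiceProfile("Plot a parameter as a time series"),
    described_by=ServiceModel(["select data", "render plot"]),
    supports=ServiceGrounding("HTTP"),
)


def answers(service):
    """Map each of the slide's three questions to the part that answers it."""
    return {
        "What does the service provide?": service.presents.description,
        "How is it used?": service.described_by.steps,
        "How does one interact with it?": service.supports.protocol,
    }


for question, answer in answers(plot_service).items():
    print(question, "->", answer)
```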

Developing a service ontology
• Use case: find and display in the same projection, sea surface temperature and land surface temperature from a global climate model.
• Classes/concepts:
  – Temperature
  – Surface (sea/land)
  – Model
  – Climate
  – Global
  – Projection
  – Display
  – …

Service ontology
• Climate model is a model
• Model has domain
• Climate Model has component representation
• Land surface is-a component representation
• Ocean is-a component representation
• Sea surface is part of ocean
• Model has spatial representation (and temporal)
• Spatial representation has dimensions
• Latitude-longitude is a horizontal spatial representation
• Displaced pole is a horizontal spatial representation
• Ocean model has displaced pole representation
• Land surface model has latitude-longitude representation
• Lambert conformal is a geographic spatial representation
• Reprojection is a transform between spatial representation
• …

Service ontology
• A sea surface model has grid representation displaced pole and land surface model has grid representation latitude-longitude and both must be transformed to Lambert conformal for display
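The conclusion on this slide is the kind of answer a service ontology should let software derive: given each model's grid representation and the display projection, work out which reprojections are needed. A minimal sketch follows, with the representation names taken from the slide and the dictionary and function names assumed for the example.

```python
# Grid representation of each model, as stated on the slide.
GRID_REPRESENTATION = {
    "sea surface model": "displaced pole",
    "land surface model": "latitude-longitude",
}

DISPLAY_PROJECTION = "Lambert conformal"


def reprojections_needed(models, target=DISPLAY_PROJECTION):
    """List (model, from_representation, to_representation) transforms
    required so that every model is displayed in the same projection."""
    return [
        (model, GRID_REPRESENTATION[model], target)
        for model in models
        if GRID_REPRESENTATION[model] != target
    ]


for step in reprojections_needed(["sea surface model", "land surface model"]):
    print("reproject %s: %s -> %s" % step)
```

If the target were latitude-longitude instead, only the sea surface model would need a transform, which is the sort of distinction the ontology makes explicit for a display service.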

Best practices
• Ontologies/vocabularies must be shared and reused - swoogle.umbc.edu, www.planetont.org
• Examine ‘core vocabularies’ to start with
  – SKOS Core: about knowledge systems
  – Dublin Core: about information resources, digital libraries, with extensions for rights, permissions, digital rights management
  – FOAF: about people and their organizations
  – DOAP: on the descriptions of software projects
  – DOLCE seems the most promising to match science ontologies
• Go “Lite” as much as possible, then DL, and only if you have to, Full - balancing expressibility vs. implementability
• Minimal properties to start, add only when needed

Tutorial Summary
• Many different options for ontology development and encoding
• Tools are in reasonable shape, no killer-tool
• Best practices DO exist
  – PLEASE DO NOT just start coding OWL!
• Use case should drive the functional requirements of both your ontology and how you will ‘build’ one
• PARTNER with someone already familiar

More information
• OWL-S - http://www.w3.org/Submission/OWL-S
• SWSO/F/L - Semantic Web Services Ontology/Framework/Language http://www.w3.org/Submission/SWSF/
• WSMO/X/L - Web Services Modeling Ontology/Execution/Language http://www.w3.org/Submission/WSMX/ www.wsmo.org, www.wsmx.org
• SAWSDL (WSDL-S)

Other tools
• Reasoners
  – Pellet, Racer, Medius KBS, FaCT++, fuzzyDL, KAON2, MSPASS, QuOnto
• Query Languages
  – SPARQL, XQUERY, SeRQL, OWL-QL, RDFQuery
• Other Tools for Semantic Web
  – Search: SWOOGLE swoogle.umbc.edu
  – Collaboration: www.planetont.org
  – Other: Jena, SeSAME/SAIL, Mulgara, Eclipse, KOWARI
  – Semantic wiki: OntoWiki, SemanticMediaWiki

Editors
• Protégé (http://protege.stanford.edu)
• SWOOP (http://mindswap.org/2004/SWOOP)
• Altova SemanticWorks (http://www.altova.com/download/semanticworks/semantic_web_rdf_owl_editor.html)
• SWeDE (http://owleclipse.projects.semwebcentral.org/InstallSwede.html), goes with Eclipse
• Medius
• TopBraid Composer and other commercial tools
• Visual Ontology Modeler (VOM) - Sandpiper
• CMAP Ontology Editor (COE) (http://cmap.ihmc.us/coe)

What about Earth Science?
• SWEET (Semantic Web for Earth and Environmental Terminology)
  – http://sweet.jpl.nasa.gov
  – based on GCMD terms
  – modular using faceted and integrative concepts
• VSTO (Virtual Solar-Terrestrial Observatory)
  – http://vsto.hao.ucar.edu
  – captures observational data (from instruments)
  – modular using domains
• MMI
  – http://marinemetadata.org
  – captures aspects of marine data, ocean observing systems
  – partly modular, mostly by developed project
• GeoSciML
  – http://www.opengis.net/GeoSciML/
  – is a GML (Geography ML) application language for Geoscience
  – modular, in ‘packages’