Next Generation Scientific Digital Libraries (or The Semantic Web and Digital Libraries as Knowledge Systems)
Deborah McGuinness
Co-Director Knowledge Systems, Artificial Intelligence Laboratory
Stanford University
dlm@ksl.stanford.edu
McGuinness, JCIS, June 11, 2005
Outline
• Introduction
• The Semantic Web, Ontologies, and the Web Ontology Language (OWL)
• Selected Technical Benefits of Semantic Technologies
• Discussion and Directions
Semantic Web Perspectives
• The Semantic Web means different things to different people. It is multi-dimensional:
– Distributed data access
– Inference
– Data Integration
– Logic
– Services
– Search (based on term meaning)
– Configuration
– Agents
– …
• Different users value these dimensions differently
• Theme: Machine-operational declarative specification of the meaning of terms
Semantic Web Layers
Ontology Level
– Languages (CLASSIC, DAML-ONT, DAML+OIL, OWL, …)
– Environments (FindUR, Chimaera, OntoBuilder/Server, Sandpiper Tools, …)
– Standards (NAPLPS, …, W3C’s WebOnt, W3C’s Semantic Web Best Practices, EU/US Joint Committee, OMG ODM, …)
Rules
– SWRL (previously CLASSIC Rules, explanation environment, extensibility issues, contracts, …)
Logic
– Description Logics
Proof
– PML, Inference Web Services and Infrastructure
Trust
– IWTrust, Policy encodings, …
http://www.w3.org/2004/Talks/0412-RDF-functions/slide4-0.html
Semantic Web Statements
• The Semantic Web is made up of individual statements: subject, predicate, object
• The subject and predicate are Uniform Resource Identifiers (URIs); the object can be a URI or an optionally typed literal value
[Figure: example RDF graph with nodes #Deborah, #Peter, #Stanford, #NCAR, and #McGuinnessAssoc, connected by worksFor and collaboratesWith edges, with surname literals “Fox” and “McGuinness”]
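The statement model above can be sketched in a few lines of plain Python. This is an illustration only, not an RDF library: the `Triple` and `Literal` types, the `objects` helper, and the `http://example.org/people#` base URI are all hypothetical, and the specific edges are one plausible reading of the slide's figure.

```python
from typing import NamedTuple, Optional, Union


class Literal(NamedTuple):
    """A literal value with an optional datatype URI."""
    value: str
    datatype: Optional[str] = None


class Triple(NamedTuple):
    """One statement: subject and predicate are URIs, obj is a URI or a Literal."""
    subject: str
    predicate: str
    obj: Union[str, Literal]


# Hypothetical base URI, used only for this illustration.
EX = "http://example.org/people#"

# Illustrative edges; the exact edges in the slide's figure are assumed.
graph = {
    Triple(EX + "Deborah", EX + "worksFor", EX + "Stanford"),
    Triple(EX + "Deborah", EX + "collaboratesWith", EX + "Peter"),
    Triple(EX + "Peter", EX + "worksFor", EX + "NCAR"),
    Triple(EX + "Peter", EX + "surname", Literal("Fox")),
}


def objects(graph, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return {t.obj for t in graph
            if t.subject == subject and t.predicate == predicate}
```

A call such as `objects(graph, EX + "Peter", EX + "surname")` returns the set of literals attached to that node, which is the basic lookup the later slides build on.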
Ontology Spectrum
[Figure: spectrum from least to most expressive]
– Catalog/ID
– Terms/glossary
– Thesauri “narrower term” relation
– Informal is-a
– Formal is-a
– Formal instance
– Frames (properties)
– Value Restrs.
– Disjointness, Inverse, part-of…
– General Logical constraints
Ontology languages such as DAML+OIL, OWL can be used to encode the spectrum
Originally from AAAI 1999 Ontologies Panel, updated by McGuinness
General Nature of Descriptions
[Figure: annotated description of a wine]
– General categories (class, superclass): a WINE, a LIQUID, a POTABLE
– Structured components (roles/properties with number/cardinality and value restrictions):
grape: chardonnay, ... [>= 1]
sugar-content: dry, sweet, off-dry
color: red, white, rose
price: a PRICE
winery: a WINERY
– Interconnections between parts: grape dictates color (modulo skin); harvest time and sugar are related
DAML/OWL Language
• Extends vocabulary of XML and RDF/S
• Rich ontology representation language
• Language features chosen for efficient implementations
[Figure: language lineage. Web Languages: XML, RDF/S. Ontology languages: DAML-ONT, OIL, DAML+OIL, OWL. Formal Foundations: Frame Systems, Description Logics (FACT, CLASSIC, DLP, …)]
Selected Technical Benefits
1. Integrating Multiple Data Sources
2. Semantic Drill Down / Focused Perusal
3. Statements about Statements
4. Inference
5. Translation
6. Smart (Focused) Search
7. Smarter Search … Configuration
8. Proof
1: Integrating Multiple Data Sources
• The Semantic Web lets us merge statements from different sources
• The RDF Graph Model allows programs to use data uniformly regardless of the source
• Figuring out where to find such data is a motivator for Semantic Web Services
[Figure: graph around #US with edges name (“United States”), currency (#USD), and telephoneCode (“1”). Different line and text colors represent different data sources]
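Because every source contributes statements of the same shape, merging reduces to a set union. A minimal sketch, assuming two hypothetical sources (`source_a`, `source_b`) that hold the statements from the slide's figure as plain tuples; the `describe` helper is also an assumption made for illustration:

```python
# Two hypothetical sources describing the same resource.
source_a = {
    ("#US", "name", "United States"),
    ("#US", "currency", "#USD"),
}
source_b = {
    ("#US", "telephoneCode", "1"),
    ("#US", "name", "United States"),  # overlaps with source_a
}

# Merging is a set union: identical statements collapse into one.
merged = source_a | source_b


def describe(graph, subject):
    """Collect every (predicate, object) pair known about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)
```

After the merge, `describe(merged, "#US")` no longer needs to know which source contributed which statement.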
2: Drill Down / Focused Perusal
• The Semantic Web uses Uniform Resource Identifiers (URIs) to name things
• These can typically be resolved to get more information about the resource
• This essentially creates a web of data analogous to the web of text created by the World Wide Web
• Ontologies are represented using the same structure as content
– We can resolve class and property URIs to learn about the ontology
[Figure: graph with nodes …#Deborah, …#McGuinnessAssoc, …#Stanford, …#California, …#Company, and …#University, connected by worksFor, type, and locatedIn edges]
3: Statements about Statements
• The Semantic Web allows us to make statements about statements
– Timestamps
– Provenance / Lineage
– Authoritativeness / Probability / Uncertainty
– Security classification
– …
• This is an unsung virtue of the Semantic Web
[Figure: the statement “#US population 290342554” annotated with type #Estimate and year 2003. From CIA World Factbook]
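One way to picture statements about statements is to let a whole statement appear as the subject of further statements. The sketch below assumes a hypothetical `Statement` type and a `source` predicate name that is not in the slide; it is not the RDF reification vocabulary itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """A hashable statement, so it can itself be the subject of other statements."""
    subject: str
    predicate: str
    obj: object


# The base statement from the slide's figure.
population = Statement("#US", "population", 290342554)

# Metadata: triples whose subject is the statement above.
# The "source" predicate name is an assumption made for this sketch.
annotations = {
    (population, "type", "#Estimate"),
    (population, "year", 2003),
    (population, "source", "CIA World Factbook"),
}


def about(statement, annotations):
    """Return the metadata recorded for a given statement."""
    return {p: o for s, p, o in annotations if s == statement}
```

The same pattern carries timestamps, provenance, or security classifications without changing the base statement.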
4: Inference
• The formal foundations of the Semantic Web allow us to infer additional (implicit) statements that are not explicitly made
• Unambiguous semantics allow question answerers to infer that objects are the same, objects are related, objects have certain restrictions, …
• SWRL allows us to make additional inferences beyond those provided by the ontology
[Figure: family graph over #Deborah, #Louise, and #Joe with edges sibling, hasBrother, daughterOf, hasMother, hasChild, and hasUncle]
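The kind of inference described above can be sketched as naive forward chaining over tuples. The two starting facts and the three rules are an illustrative reading of the slide's family figure, not the author's rule set, and this is plain Python rather than SWRL.

```python
# Illustrative starting facts (assumed from the figure).
facts = {
    ("#Deborah", "hasMother", "#Louise"),
    ("#Louise", "hasBrother", "#Joe"),
}


def infer(facts):
    """Apply the rules repeatedly until no new statements appear."""
    derived = set(facts)
    while True:
        new = set()
        for (x, p1, y) in derived:
            if p1 == "hasBrother":
                # Rule 1: a brother is a sibling, and sibling is symmetric.
                new.add((x, "sibling", y))
                new.add((y, "sibling", x))
            if p1 == "hasMother":
                # Rule 2: hasChild is the inverse of hasMother.
                new.add((y, "hasChild", x))
                # Rule 3: a mother's brother is an uncle.
                for (m, p2, u) in derived:
                    if m == y and p2 == "hasBrother":
                        new.add((x, "hasUncle", u))
        if new <= derived:
            return derived
        derived |= new
```

Calling `infer(facts)` adds the implicit statements, including `("#Deborah", "hasUncle", "#Joe")`, to the two that were asserted.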
5: Translation
• While encouraging sharing, the Semantic Web allows multiple URIs to refer to the same thing
• There are multiple levels of mapping
– Classes
– Properties
– Instances
– Ontologies
• OWL supports equivalence and specialization; SWRL allows more complex mappings
[Figure: one #car described as type ont1:Car with ont1:country fips:UK, and the other as type ont2:Vehicle with ont2:country iso:GB]
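The mapping levels above can be sketched with two lookup tables. The direction and content of the mappings (for example, that ont1:Car specializes ont2:Vehicle) are assumptions read off the figure, and the `translate` helper is hypothetical.

```python
# Equivalence mappings at the property and instance level (assumed).
EQUIVALENT = {
    "ont1:country": "ont2:country",  # property-level mapping
    "fips:UK": "iso:GB",             # instance-level mapping
}

# Specialization at the class level (assumed): every ont1:Car is an ont2:Vehicle.
SUBCLASS_OF = {"ont1:Car": "ont2:Vehicle"}


def translate(triple):
    """Rewrite a statement from the first vocabulary into the second."""
    s, p, o = triple
    p = EQUIVALENT.get(p, p)
    o = EQUIVALENT.get(o, o)
    if p == "type":
        # Generalizing a class membership to its superclass is sound.
        o = SUBCLASS_OF.get(o, o)
    return (s, p, o)
```

Terms without a mapping pass through unchanged, so partially mapped data still translates.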
6: Smart (Focused) Search
• The Semantic Web associates one or more classes with each object
• We can use ontologies to enhance search by:
– Query expansion
– Sense disambiguation
– Type with restrictions
– …
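Query expansion, the first technique listed, can be sketched by walking a subclass hierarchy. The small wine taxonomy and the whitespace-based `search` function are hypothetical stand-ins for a real ontology and search engine.

```python
# Hypothetical taxonomy: child class -> parent class.
SUBCLASS_OF = {
    "Chardonnay": "WhiteWine",
    "Riesling": "WhiteWine",
    "WhiteWine": "Wine",
    "RedWine": "Wine",
}


def expand(term):
    """Return the term plus all of its transitive subclasses."""
    expanded = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in expanded and child not in expanded:
                expanded.add(child)
                changed = True
    return expanded


def search(documents, term):
    """Return documents containing the term or any of its subclasses."""
    terms = expand(term)
    return [doc for doc in documents if terms & set(doc.split())]
```

A query for `WhiteWine` then also matches documents that only mention `Chardonnay` or `Riesling`.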
7: Smarter Search / Configuration
KSL Wine Agent Semantic Web Integration Example
Uses emerging web standards to enable smart web applications
Given a meal description
• Deborah’s Specialty
Describe matching wines
• White, Dry, Full bodied…
Retrieve some specific options from web
• Forman Chardonnay from DLM’s cellar, ThreeSteps from wine.com, …
• Info: http://www.ksl.stanford.edu/people/dlm/webont/wineAgent/
KSL Wine Agent Semantic Web Integration Technology
• OWL for representing a domain ontology of foods, wines, their properties, and relationships between them
• JTP theorem prover for deriving appropriate pairings
• DQL/OWL QL for querying a knowledge base
• Inference Web for explaining and validating answers (descriptions or instances)
• Web Services for interfacing with vendors
• Connections to online web agents/information services
• Utilities for conducting and caching the above transactions
8: Proof
• The logical foundations of the Semantic Web allow us to construct proofs that can be used to improve transparency, understanding, and trust
• Proof and Trust are ongoing research areas for the Semantic Web; e.g., see PML and Inference Web
[Figure: #W3C hasMember #Acme, #Acme hasEmployee #Bob, with the rule “Employees of member companies can access W3C’s content”]
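The figure's example can be sketched as a function that returns not just an answer but the steps that justify it. The step format, the `canAccess` predicate, and the `#W3CContent` name are hypothetical; this is plain Python, not PML.

```python
# Facts from the slide's figure.
facts = {
    ("#W3C", "hasMember", "#Acme"),
    ("#Acme", "hasEmployee", "#Bob"),
}

RULE = "Employees of member companies can access the organization's content"


def prove_access(person, org, facts):
    """Return a list of proof steps, or None if access cannot be derived."""
    for (o, p, company) in facts:
        if (o == org and p == "hasMember"
                and (company, "hasEmployee", person) in facts):
            return [
                ("premise", (org, "hasMember", company)),
                ("premise", (company, "hasEmployee", person)),
                ("rule", RULE),
                ("conclusion", (person, "canAccess", org + "Content")),
            ]
    return None
```

Returning the premises and the rule alongside the conclusion is what lets a user or another agent inspect why access was granted.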
Scientific Digital Libraries
Scientists should be able to access a global, distributed knowledge base of scientific data that:
• appears to be integrated
• appears to be locally available
• is easy to search
But… data is obtained by multiple instruments, using various protocols, in differing (possibly unfamiliar) vocabularies, using (sometimes unstated) assumptions, with inconsistent (or non-existent) meta-data. It may be inconsistent, incomplete, evolving, and distributed.
Future Science Digital Libraries
Repositories of data with markup and provenance that enables…
- sharing data AND tools with distributed colleagues
- understanding assumptions, constraints, and enough information to determine applicability and reuse
- research data and experiment composition and dependence
- consistency and validation checking
and more…
Current and future repositories are poised to change the nature of how science is done by supporting interoperability and sharing at new levels.
Projects like the Virtual Solar Terrestrial Observatory, GEON, etc. use semantic web technology to enable next generation digital scientific libraries.
Conclusion
• Semantic Web Languages and Tools are ready for use (OWL, OWL-S, Cerebra, Sandpiper, …)
• Predecessor technologies (description logics etc.) have been in use for decades
• Current ontologies and tools being used in science:
– Gene Ontology (GO)
– NCI and UMLS
– SWEET (Semantic Web for Earth and Environmental Terminology)
– Immune Epitope Database
– GEON
– Virtual Solar Terrestrial Observatory
– …
• The time is NOW to work together towards next generation semantically-enabled interoperable systems
Resources
Selected Papers:
- McGuinness. Ontologies come of age, 2003.
- Das, Wei, McGuinness. Industrial Strength Ontology Evolution Environments, 2002.
- Kendall, Dutra, McGuinness. Towards a Commercial Strength Ontology Development Environment, 2002.
- McGuinness. Description Logics Emerge from Ivory Towers, 2001.
- McGuinness. Ontologies and Online Commerce, 2001.
- McGuinness. Conceptual Modeling for Distributed Ontology Environments, 2000.
- McGuinness, Fikes, Rice, Wilder. An Environment for Merging and Testing Large Ontologies, 2000.
- Brachman, Borgida, McGuinness, Patel-Schneider. Knowledge Representation meets Reality, 1999.
- McGuinness. Ontological Issues for Knowledge-Enhanced Search, 1998.
Selected Tutorials:
- Smith, Welty, McGuinness. OWL Web Ontology Language Guide, 2004.
- Noy, McGuinness. Ontology Development 101: A Guide to Creating your First Ontology, 2001.
- Brachman, McGuinness, Resnick, Borgida. How and When to Use a KL-ONE-like System, 1991.
Languages, Environments, Software:
- OWL - http://www.w3.org/TR/owl-features/ , http://www.w3.org/TR/owl-guide/
- Inference Web - http://www.ksl.stanford.edu/software/iw/
- Wine Agent - http://www.ksl.stanford.edu/people/dlm/webont/wineAgent/
- Chimaera - http://www.ksl.stanford.edu/software/chimaera/
- FindUR - http://www.research.att.com/people/~dlm/findur/
- TAP - http://tap.stanford.edu/
- OWL-QL - http://www.ksl.stanford.edu/projects/owl-ql/
- Cerebra (formerly Network Inference) - http://www.cerebra.com
- Sandpiper Software - http://www.sandsoft.com
- Virtual Solar Terrestrial Observatory - http://vsto.hao.ucar.edu/
EXTRAS
OWL
OWL Sublanguages
• OWL Lite supports users primarily needing a classification hierarchy and simple constraint features. (For example, while it supports cardinality constraints, it only permits cardinality values of 0 or 1. It should be simpler to provide tool support for OWL Lite than its more expressive relatives, and provides a quick migration path for thesauri and other taxonomies.)
• OWL DL supports users who need maximum expressiveness while their reasoning systems maintain computational completeness (all conclusions are guaranteed to be computed) and decidability (all computations will finish in finite time). OWL DL includes all OWL language constructs, but they can be used only under certain restrictions (for example, while a class may be a subclass of many classes, a class cannot be an instance of another class). OWL DL is named for its correspondence with description logics.
• OWL Full supports users who want maximum expressiveness and the syntactic freedom of RDF with no computational guarantees. For example, in OWL Full a class can be treated simultaneously as a collection of individuals and as an individual in its own right. OWL Full allows an ontology to augment the meaning of the pre-defined (RDF or OWL) vocabulary. It is unlikely that any complete and efficient reasoner will be able to support every feature of OWL Full.
OWL Lite Features
• RDF Schema Features
– Class, rdfs:subClassOf, Individual
– rdf:Property, rdfs:subPropertyOf
– rdfs:domain, rdfs:range
• Equality and Inequality
– equivalentClass, equivalentProperty, sameAs
– differentFrom
– AllDifferent, distinctMembers
• Restricted Cardinality
– minCardinality, maxCardinality (restricted to 0 or 1)
– cardinality (restricted to 0 or 1)
• Property Characteristics
– inverseOf, TransitiveProperty, SymmetricProperty
– FunctionalProperty (unique), InverseFunctionalProperty
– allValuesFrom, someValuesFrom (universal and existential local range restrictions)
• Datatypes
– Following the decisions of RDF Core
• Header Information
– imports, Dublin Core Metadata, versionInfo
OWL Features
• Class Axioms
– oneOf (enumerated classes)
– disjointWith
– equivalentClass applied to class expressions
– rdfs:subClassOf applied to class expressions
• Boolean Combinations of Class Expressions
– unionOf
– intersectionOf
– complementOf
• Arbitrary Cardinality
– minCardinality
– maxCardinality
– cardinality
• Filler Information
– hasValue: descriptions can include specific value information
OWL Lite and OWL
• Overview: http://www.w3.org/TR/owl-features/
• Guide: http://www.w3.org/TR/owl-guide/
• Reference: http://www.w3.org/TR/owl-ref/
• Semantics and Abstract Syntax: http://www.w3.org/TR/owl-absyn/
Virtual Solar Terrestrial Observatories
Background
Scientists should be able to access a global, distributed knowledge base of scientific data that:
• appears to be integrated
• appears to be locally available
But… data is obtained by multiple instruments, using various protocols, in differing vocabularies, using (sometimes unstated) assumptions, with inconsistent (or nonexistent) meta-data. It may be inconsistent, incomplete, evolving, and distributed.
Virtual Observatories
Make data and tools quickly and easily accessible to a wide audience.
Operationally, virtual observatories need to find the right balance of data/model holdings, portals, and client software that a researcher can use without effort or interference, as if all the materials were available on his/her local computer, using the user’s preferred language.
They are likely to provide controlled vocabularies that may be used for interoperation in appropriate domains, along with database interfaces for access and storage and “smart” tools for evolution and maintenance.
Virtual Solar Terrestrial Observatory (VSTO)
• A distributed, scalable education and research environment for searching, integrating, and analyzing observational, experimental, and model databases
• Subject matter covers the fields of solar, solar-terrestrial, and space physics
• Provides virtual access to specific data, model, tool, and material archives containing items from a variety of space- and ground-based instruments and experiments, as well as individual and community modeling and software efforts bridging research and educational use
• 3-year NSF-funded project, in its first year
Inference Web and Explanation
Inference Web
Framework for explaining reasoning tasks by storing, exchanging, combining, annotating, filtering, segmenting, comparing, and rendering proofs and proof fragments provided by multiple distributed reasoners.
• OWL-based Proof Markup Language (PML) specification as an interlingua for proof interchange
• IWExplainer for generating and presenting interactive explanations from PML proofs, providing multiple dialogues and abstraction options
• IWBrowser for displaying (distributed) PML proofs
• IWBase distributed repository of proof-related meta-data such as inference engines/rules/languages/sources
• Integrated with theorem provers, text analyzers, web services, …
http://iw.stanford.edu
SW Questions & Answers
Users can explore extracted entities and relationships, create new hypotheses, ask questions, browse answers, and get explanations for answers.
[Figure: screenshot with callouts: a question, an answer, an abstracted explanation, a context for explaining the answer]
(This graphical interface was done by Battelle, supported by KSL.)
Browsing Proofs
The proof associated with an answer can be browsed in multiple formats.
[Figure: screenshot with callouts: menu to switch between Graphical/HTML proof styles, proof rendered in graphical style, provenance information associated with a selected NodeSet]
Selected Semantic Web Tools
Protégé
• Open source ontology editor from Stanford Medical Informatics
– Large user community
• Good GUI for subject-matter experts
• Extra features
– SWRL support
– PROMPT versioning
• http://protege.stanford.edu
Cerebra
• Commercial OWL DL tools
– Cerebra Construct: ontology engineering and external source mapping within a familiar MS Visio framework
– Cerebra Server: commercial-grade inference platform, providing industry-standard query, high-performance inference, and management capabilities with emphasis on scalability, availability, robustness, and 100% correctness. Based on initial work from University of Manchester
– CEREBRA Repository: collaborative object repository for metadata, vocabulary, security, and policy management
• http://www.cerebra.com
Medius / Sandpiper
• Visual Ontology Modeler
– UML-based modeling tool
– Add-in to Rational Rose
– Produces RDF, OWL, DAML, UML, …
• Medius Knowledge Brokering Suite
• OMG Ontology Definition Metamodel (ODM)
• http://www.sandsoft.com
SWOOP
• Hypermedia-based open source ontology editor
– Includes an interface to the Pellet OWL DL reasoner
• http://www.mindswap.org/2004/SWOOP/
Pellet
• Open source Java OWL DL reasoner
– API supports
• Species validation (OWL Lite/DL/Full)
• Consistency checking
• Classification
• Entailment
• Query
• http://www.mindswap.org/2003/pellet/
SNOBASE
• Ontology management system from IBM
– Ontology Directory
– Query capability
– JOBC API
• http://www.alphaworks.ibm.com/tech/snobase
Jena
• Open source API from HP Labs UK
• Most popular Java API
– Parser
– Serializer
• Extra features
– Persistence (RDBMS)
– Query (RDQL)
– Reasoning Rule Engine
• http://www.hpl.hp.com/semweb/
SweetRules
• Open source rule framework
• Executes SWRL and RuleML using a variety of rule engines
– CommonRules
– XSB Prolog
– JESS
– Jena 2
• Translates between various rule formats
• http://sweetrules.projects.semwebcentral.org
SemWebCentral
• Open source software development site dedicated to the Semantic Web
– 79+ projects
– 257+ developers
• Select projects by workflow or other attributes
• http://semwebcentral.org
Other Tool Resources
• Dave Beckett’s RDF Resource Guide
– http://www.ilrt.bris.ac.uk/discovery/rdf/resources/
• Michael Denny’s Survey of Ontology Tools
– http://www.xml.com/pub/a/2004/07/14/onto.html
More Info
Deborah McGuinness
dlm@ksl.stanford.edu
http://www.ksl.stanford.edu/people/dlm
Mike Dean
mdean@bbn.com
http://www.daml.org/people/mdean/
Background
AT&T Bell Labs AI Principles Dept
– Description Logics, CLASSIC, explanation, ontology environments
– Semantic Search, FindUR, Collaborative Ontology Building Env
– Apps: Configurators, PROSE/Questar, Data Mining, …
Stanford Knowledge Systems, Artificial Intelligence Lab
– Ontology Evolution Environments (Diagnostics and Merging): Chimaera
– Explanation and Trust, Inference Web
– Semantic Web Representation and Reasoning Languages: DAML-ONT, DAML+OIL, OWL
– Rules and Services: SWRL, OWL-S, Explainable SDS, KSL Wine Agent
McGuinness Associates
– Ontology Environments: Sandpiper, VerticalNet, Cisco…
– Knowledge Acquisition and Ontology Building: VSTO, GEON, ImEp, …
– Applications: GM: Search, etc.; CISCO: meta data org, etc.
– Boards: Cerebra, Sandpiper, Buildfolio, Tumri, Katalytik
Semantic Web Layering
[Figure: Semantic Web layer diagram]
From: Berners-Lee, XML 2000