The Semantic Web Deborah McGuinness Associate Director and Senior Research Scientist Knowledge Systems Laboratory Stanford University Stanford, CA USA dlm@ksl.stanford.edu http://www.ksl.stanford.edu/people/dlm
Today: Rich Information Source for Human Manipulation/Interpretation Human
“I know what was input” • Global documents and terms indexed and available for search • Search engine interfaces • Entire documents retrieved according to relevance (instead of answers) • Human input, review, assimilation, integration, action, etc. • Special purpose interfaces required for user friendly applications The web knows what was input but does little interpretation, manipulation, integration, and action
Information Discovery… but not much more • Human intensive (requiring input reformulation and interpretation) • Display intensive (requiring filtering) • Not interoperable • Not agent-operational • Not adaptive • Limited context • Limited service Analogous to a new assistant who is thorough yet lacks common sense, context, and adaptability
Future: Rich Information Source for Agent Manipulation/Interpretation Human Agent
“I know what was meant” • Understand term meaning and user background • Interoperable (can translate between applications) • Programmable (thus agent operational) • Explainable (thus maintains context and can adapt) • Capable of filtering (thus limiting display and human intervention requirements) • Capable of executing services
Semantic Markup Languages such as DAML+OIL Ontologies (www.daml.org) • Encoding background info • User modeling info • Annotating web pages • Annotating services DAML-enabled web pages, thereby limiting needs for human disambiguation input, human interpretation, multiple answer display, translation assistance, agent assistance, adaptivity support, etc.
The Semantic Web Enables… • E-commerce solutions • M-commerce • Web assistants • New models of intelligent services • … New forms of web assistants/agents that act on a human’s behalf, requiring less from humans and their communication devices…
Under the covers Meaning needs to be encoded, understood, and reasoned with. -- Ontologies capture meanings of terms and their interrelationships
What is an Ontology? [Figure: ontology spectrum, from simple to expressive: Catalog/ID • Terms/glossary • Thesauri (“narrower term” relation) • Informal is-a • Formal is-a • Formal instance • Frames (properties) • Value restrictions • Disjointness, inverse, part-of… • General logical constraints]
Ontologies and importance to E-Commerce Simple ontologies (taxonomies) provide: • Controlled shared vocabulary (search engines, authors, users, databases, programs/agents all speak same language) • Site Organization and Navigation Support • Expectation setting (left side of many web pages) • “Umbrella” Upper Level Structures (for extension) • Browsing support (tagged structures such as Yahoo!) • Search support (query expansion approaches such as FindUR, e-Cyc) • Sense disambiguation
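The query expansion bullet above can be illustrated with a minimal sketch. It assumes a toy taxonomy stored as a dictionary from each term to its narrower terms; the taxonomy contents and the name `expand_query` are hypothetical and are not taken from FindUR or e-Cyc.

```python
# Hypothetical toy taxonomy: each term maps to its direct narrower terms.
TAXONOMY = {
    "wine": ["red wine", "white wine"],
    "red wine": ["merlot", "zinfandel"],
    "white wine": ["chardonnay"],
}


def expand_query(term, taxonomy):
    """Return the term followed by all of its narrower terms (depth-first)."""
    expanded = []
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles or repeated terms
        seen.add(current)
        expanded.append(current)
        # Push children in reverse so they are visited in listed order.
        stack.extend(reversed(taxonomy.get(current, [])))
    return expanded


if __name__ == "__main__":
    # A search for "wine" would also match pages that mention only "merlot".
    print(expand_query("wine", TAXONOMY))
```

A search engine could then OR the expanded terms together, so that a page mentioning only a narrower term still matches the broader query.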
Ontologies and importance to E-Commerce II • Consistency Checking • Completion • Interoperability Support • Support for validation and verification testing (e.g. http://ksl.stanford.edu/projects/DAML/chimaera-jtpcardinality-test1.daml) • Configuration support • Structured, “surgical” comparative customized search • Generalization/Specialization • … Foundation for expansion and leverage
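As a rough illustration of consistency checking, the sketch below tests one instance against two kinds of constraint: class disjointness and a maximum cardinality on a property. The class names, the constraint tables, and `check_instance` are all hypothetical; this is not how Chimaera or any DAML reasoner is implemented.

```python
# Hypothetical ontology fragment (all names are illustrative assumptions).
SUBCLASS_OF = {"RedWine": "Wine", "WhiteWine": "Wine"}
DISJOINT = [("RedWine", "WhiteWine")]
MAX_CARDINALITY = {("Wine", "color"): 1}  # a Wine has at most one color


def ancestors(cls):
    """Return cls followed by all of its superclasses."""
    result = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def check_instance(name, types, properties):
    """Return a list of human-readable consistency problems for one instance."""
    problems = []
    all_types = set()
    for declared in types:
        all_types.update(ancestors(declared))
    # Disjointness: an instance may not belong to two disjoint classes.
    for left, right in DISJOINT:
        if left in all_types and right in all_types:
            problems.append(f"{name}: member of disjoint classes {left} and {right}")
    # Cardinality: a property may not have more values than the class allows.
    for (cls, prop), limit in MAX_CARDINALITY.items():
        values = properties.get(prop, [])
        if cls in all_types and len(values) > limit:
            problems.append(f"{name}: {prop} has {len(values)} values, at most {limit} allowed")
    return problems


if __name__ == "__main__":
    for problem in check_instance("w2", ["RedWine", "WhiteWine"], {"color": ["red", "white"]}):
        print(problem)
```

Because the cardinality limit is declared on `Wine`, it is inherited by every subclass, which is the kind of leverage a formal is-a hierarchy provides.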
A Few Observations about Ontologies – Simple ontologies can be built by non-experts • Verity’s Topic Editor, Collaborative Topic Builder, GFP, Chimaera, Protégé, OIL-ED, etc. – Ontologies can be semi-automatically generated • from crawls of sites such as Yahoo!, Amazon, Excite, etc. • Semi-structured sites can provide starting points – Ontologies are exploding (business pull instead of technology push) • most e-commerce sites are using them - mySimon, Amazon, Yahoo! Shopping, VerticalNet, etc. • Controlled vocabularies (for the web) abound - SIC codes, UMLS, UN/SPSC, Open Directory (DMOZ), RosettaNet, SUO • Business interest expanding – ontology directors, business ontologies are becoming more complicated (roles, value restrictions, …), VC firms interested • DTDs are making more ontology information available • Markup Languages growing: XML, RDF, DAML, RuleML, xxML • “Real” ontologies are becoming more central to applications
Implications and Needs • Ontology Language Syntax and Semantics (DAML+OIL) • Environments for Creation and Maintenance of Ontologies • Training (Conceptual Modeling, reasoning implications, …)
Issues – Collaboration among distributed teams – Interconnectivity with many systems/standards – Analysis and diagnosis – Scale – Versioning – Security – Ease of use – Diverse training levels/user support – Presentation style – Lifecycle – Extensibility
Chimaera – An Ontology Environment Tool An interactive web-based tool aimed at supporting: • Ontology analysis (correctness, completeness, style, …) • Merging of ontological terms from varied sources • Maintaining ontologies over time • Validation of input • Features: multiple I/O languages, loading and merging into multiple namespaces, collaborative distributed environment support, integrated browsing/editing environment, extensible diagnostic rule language • Used in commercial and academic environments • Available as a hosted service from www-ksl-svc.stanford.edu • Information: www.ksl.stanford.edu/software/chimaera
XML • World Wide Web Consortium (W3C) standard • Provides important solution to syntax problem and simple semantics and schemas: <SSN>444-23-2656</SSN> • Now we can describe the meaning of words • Many applications of XML appearing: – Geography Markup Language (GML) – eXtensible rights Markup Language (XrML) – Chemical Markup Language (CML) Problem: Limited semantics and ontology
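The "limited semantics" problem can be seen in a short sketch using Python's standard `xml.etree.ElementTree` parser. The second document and its `SocialSecurityNumber` tag are hypothetical: they stand for another application that marks up the same fact with a different tag name.

```python
import xml.etree.ElementTree as ET

# Two documents carrying the same fact under different tag names.
# The second tag name is a hypothetical alternative vocabulary.
DOC_A = "<person><SSN>444-23-2656</SSN></person>"
DOC_B = "<person><SocialSecurityNumber>444-23-2656</SocialSecurityNumber></person>"


def find_ssn(xml_text):
    """Return the text of the <SSN> child, or None if there is no such tag."""
    root = ET.fromstring(xml_text)
    element = root.find("SSN")
    return element.text if element is not None else None


if __name__ == "__main__":
    print(find_ssn(DOC_A))  # the tag is found
    print(find_ssn(DOC_B))  # same meaning, different tag: the parser cannot tell
```

XML fixes the syntax, so both documents parse, but nothing in XML itself says that the two tags mean the same thing. That mapping is what an ontology layer is meant to supply.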
DARPA Agent Markup Language • Builds on top of XML and RDF • Provides rich ontology representation • Key starting point for W3C Semantic Web activity • Future releases will provide logic and rules capabilities Problem: Tools to help create DAML ontologies, markup, and to facilitate access are still emerging
EXAMPLES

HTML:
<html>
  <head>
    <TITLE>Fred Jones</TITLE>
  </head>
  <body>
    <H1>Information About Fred Jones</H1>
    <P>Fred Jones is in the U.S. Air Force. He is a Captain stationed at AFRL.</P>
  </body>
</html>

XML:
<person>
  <name>Fred Jones</name>
  <employer>U.S. Air Force</employer>
  <station>AFRL</station>
  <rank>Captain</rank>
</person>

DAML:
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:daml="http://www.daml.org/2001/03/daml+oil#"
    xmlns:dod="http://www.dod.mil/personnel#"
    xmlns:af="http://www.af.mil/personnel#"
    xmlns:afrl="http://www.rl.af.mil/personnel#">
  <dod:Officer rdf:ID="fsmith">
    <dod:givenName>Fred</dod:givenName>
    <dod:surname>Smith</dod:surname>
    <dod:service rdf:resource="http://www.dod.mil/services#AirForce"/>
    <af:rank rdf:resource="http://www.af.mil/personnel#Captain"/>
    <af:station rdf:resource="http://www.af.mil/stations#AFRL_Rome"/>
    <daml:equivalentTo rdf:resource="ssn:123-45-6789"/>
  </dod:Officer>
</rdf:RDF>
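To suggest what an agent gains from the DAML form, the sketch below reads a shortened version of the DAML example and turns it into (subject, property, value) triples whose property names are full URIs. It uses only the standard `xml.etree.ElementTree` module and handles just the flat pattern shown here; it is an assumption-laden illustration, not a general RDF or DAML parser, and `triples` and `expand` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened version of the DAML example, kept well-formed.
DAML_DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dod="http://www.dod.mil/personnel#"
    xmlns:af="http://www.af.mil/personnel#">
  <dod:Officer rdf:ID="fsmith">
    <dod:surname>Smith</dod:surname>
    <af:rank rdf:resource="http://www.af.mil/personnel#Captain"/>
  </dod:Officer>
</rdf:RDF>"""


def expand(tag):
    """Turn ElementTree's '{namespace}local' tag into 'namespace' + 'local'."""
    namespace, local = tag[1:].split("}")
    return namespace + local


def triples(xml_text):
    """Extract (subject, property, value) triples from flat typed-node RDF/XML."""
    root = ET.fromstring(xml_text)
    result = []
    for node in root:
        subject = "#" + node.get(f"{{{RDF_NS}}}ID")
        # The element name of a typed node gives the instance's class.
        result.append((subject, RDF_NS + "type", expand(node.tag)))
        for prop in node:
            resource = prop.get(f"{{{RDF_NS}}}resource")
            value = resource if resource is not None else (prop.text or "").strip()
            result.append((subject, expand(prop.tag), value))
    return result


if __name__ == "__main__":
    for triple in triples(DAML_DOC):
        print(triple)
```

Unlike the plain XML version, every class and property here resolves to a URI that can be looked up in a shared ontology, so two agents can agree on what `rank` means without a hand-written mapping.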
DAML Status • DAML ontology language specification released and in use • DAML services language specification draft released • http://www.daml.org provides public Web site with DAML information • Research and corporate teams are developing DAML tools • Supported by W3C in the Semantic Web Activity • Endorsed by companies and interest growing
[Figure: layers of the Web over time (from Berners-Lee, Hendler; Nature, 4/01) • 2010?: Trustworthy Web Resources: Proof, Logic and Ontology Languages; shared terms/terminology; machine-machine communication • 2000: Self-Describing Documents: Resource Description Framework, eXtensible Markup Language • 1990: Foundation of the Current Web: HyperText Transfer Protocol]
Discussion/Conclusion • Ontologies are exploding; core of many applications • Business “pull” is driving ontology tools and languages • New generation applications need more expressive ontologies and more back end reasoning • New generation users (the general public) need more support than previous users of KR&R systems • Scale and distribution of the web force mind shift • Markup languages will revolutionize web applications • Agents can be human proxies enabling new applications and modes of interaction
Some Pointers • Ontologies Come of Age Paper: http://www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html • Ontologies and Online Commerce Paper: http://www.ksl.stanford.edu/people/dlm/papers/ontologies-and-online-commerce-abstract.html • DAML+OIL: http://www.daml.org/
Extras
What Is An Agent? • Software module • Intended to act as a proxy for you in some way • May be: – Tightly controlled – Autonomous – Mobile
Why Is This Important? • Humans work sequentially • Agents work in parallel and 24x7 • Therefore, agents can be a major productivity multiplier
Web Trends • Web is evolving from a provider of documents and images (information retrieval) • To a provider of services • Web service discovery Find me an airline service that offers flights to Singapore • Web service execution Buy me “Harry Potter and the Sorcerer’s Stone” at www. amazon. com • Web service selection, composition and interoperation Make my travel arrangements for my Internet World conference trip • Both retrieval and services lend themselves to agent technologies
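The service discovery request above ("find me an airline service that offers flights to Singapore") can be sketched as a lookup over service descriptions annotated with ontology classes. The service names, the class hierarchy, and the `discover` function are all hypothetical and do not reflect the DAML services language.

```python
# Hypothetical service-type hierarchy and service descriptions.
SERVICE_TAXONOMY = {
    "AirlineService": "TravelService",
    "HotelService": "TravelService",
    "BookSeller": "RetailService",
}

SERVICES = [
    {"name": "ExampleAir", "type": "AirlineService", "destinations": {"Singapore", "Tokyo"}},
    {"name": "ExampleJet", "type": "AirlineService", "destinations": {"London"}},
    {"name": "ExampleInn", "type": "HotelService", "destinations": {"Singapore"}},
]


def is_a(service_type, target_type):
    """True if service_type equals target_type or is one of its subclasses."""
    while service_type is not None:
        if service_type == target_type:
            return True
        service_type = SERVICE_TAXONOMY.get(service_type)
    return False


def discover(target_type, destination):
    """Return names of services of the requested type that serve the destination."""
    return [
        service["name"]
        for service in SERVICES
        if is_a(service["type"], target_type) and destination in service["destinations"]
    ]


if __name__ == "__main__":
    print(discover("AirlineService", "Singapore"))
    # Asking for the broader class also returns the hotel.
    print(discover("TravelService", "Singapore"))
```

Because matching goes through the class hierarchy, an agent asking for any `TravelService` also finds airlines and hotels without each provider having to list every category it belongs to.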
Problems • Average Web searches examine only 25% of available information • Web searches return a lot of unwanted information • Information content of the Web doubles approximately every six months • Problem continues to worsen as Web grows