RDF, OWL, SPARQL and the Semantic Web ACCU
RDF, OWL, SPARQL and the Semantic Web. ACCU 2009, Seb Rose
A Quick Recap • The web has been around for almost 20 years • In the beginning there was mostly static content • Over the past 10 years there has been a move to more dynamic content • Usage is getting faster and better, but is still often frustrating
Where Is The Meaning? • Web technologies are biased toward presentation rather than semantic content • Even when the data comes from structured sources we need to use custom presentation technologies to render it • XML Schemas & XSLT can be used for data integration, but are fragile
Semantic Web • Introduced by Tim Berners-Lee in 2001 • Supports a distributed web at the level of data (rather than presentation) • Permits machine agents to reason about content • Retains the open nature of the web
AAA • Anyone can say Anything about Any topic. • [coined by Allemang/Hendler in their book] • No waiting for authorities to agree on schema, leading to … • Network effect of gradual “semantification” of web, requiring … • Ability to filter facts based on provenance
Where Will It Go? • Semantic content may live: - at dedicated web addresses - interwoven in existing web pages • Presentation may be generated from semantic content • Existing browsers will either ignore the content or render it literally
RDF • Resource Description Framework • Introduced early in the W3C process • Expresses facts (assertions) as triples: Subject Predicate Object • E.g. Tony WorksFor Oracle
Thinking About RDF • Tabular Representation: Row – Subject Column – Predicate Cell Value – Object • Directed Graph: Node – Subject Directed Arc – Predicate Node - Object
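The two views above can be sketched in a few lines of Python. This is a minimal illustration, not a real RDF store: plain strings stand in for URIs, and every fact other than "Tony WorksFor Oracle" is a hypothetical addition for the example.

```python
from collections import defaultdict

# Facts as (subject, predicate, object) triples.
# Only the first triple comes from the slides; the others are made up.
triples = {
    ("Tony", "WorksFor", "Oracle"),
    ("Tony", "LivesIn", "Oxford"),
    ("Oracle", "LocatedIn", "California"),
}

def as_table(triples):
    """Tabular view: row = subject, column = predicate, cell = set of objects."""
    table = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        table[s][p].add(o)
    return table

def as_graph(triples):
    """Graph view: node -> sorted list of (arc label, target node)."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return {node: sorted(arcs) for node, arcs in graph.items()}

table = as_table(triples)
graph = as_graph(triples)
```

A cell holds a set because nothing stops a subject from having several values for the same predicate.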
Resources • Subjects, Predicates and Objects all name Resources • Resources are identified by URIs • If a URI is dereferenceable it is also a URL, but it doesn’t have to be a URL • Objects can be literal (XML) values
What Does RDF Look Like? • RDF has several standard serialisations • Often stored as RDF/XML, but this is not very readable • Most simply expressed as raw triples, but this is very verbose • Most often consumed by practitioners as N3
Namespaces • URIs tend to live in namespaces • These get given symbolic names to make serialisations more compact • There are various ‘standard’ namespaces: rdf, rdfs, owl, dc • You can specify a default namespace for an RDF document • Abbreviated URIs are called qnames
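As a rough sketch of how an abbreviated name maps back to a full URI, the Python below expands a qname using a prefix table. The `expand` helper and the `http://example.org/#` default namespace are assumptions for illustration; the four prefix URIs are the usual published namespaces for rdf, rdfs, owl and Dublin Core elements.

```python
# Prefix table: symbolic name -> namespace URI.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def expand(qname, prefixes=PREFIXES, default=None):
    """Expand 'prefix:local' (or ':local' for the default namespace) to a URI."""
    prefix, _, local = qname.partition(":")
    if prefix == "":
        if default is None:
            raise KeyError("no default namespace declared")
        return default + local
    return prefixes[prefix] + local

full_type = expand("rdf:type")
full_bird = expand(":Bird", default="http://example.org/#")
```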
Ontology • A semantic model – a schema • Each namespace refers to an ontology • Dublin Core (named after Dublin, Ohio, in the US) is among the most widely used, with a set of useful annotation properties • There are many domain specific ontologies (and there are search tools to help you find an applicable one)
N3 - Example
    @prefix comp: <http://www.claysnow.co.uk/ACCU2009/Example.rdf#> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

    comp:Accu2009 rdf:type comp:ComputerConference .
    comp:Accu2009 comp:startDate "April 22, 2009" .
    comp:Accu2009 comp:hasKeynote comp:BobMartin .
    comp:Accu2009 comp:hasKeynote comp:AllanKelly .
OR
    comp:Accu2009 rdf:type comp:ComputerConference ;
        comp:startDate "April 22, 2009" ;
        comp:hasKeynote comp:BobMartin , comp:AllanKelly .
N3 – Example (2)
• Abbreviate rdf:type to “a”:
    comp:Accu2009 a comp:ComputerConference ;
        comp:startDate "April 22, 2009" ;
        comp:hasKeynote comp:BobMartin , comp:AllanKelly .
Blank Nodes
• Also known as bnodes
• Useful when making statements about entities that don’t have an identifier
• Can be interpreted as “there exists”
• Shown in square brackets:
    comp:BobMartin comp:presentedKeyNoteAt
        [ a comp:Conference ; comp:locatedAt comp:BarceloOxford ] .
Lists
• Verbose in RDF, but have a compact representation in N3:
    comp:Accu2009 comp:scheduledBreaks
        ( comp:MorningCoffee comp:Lunch comp:AfternoonTea ) .
• Becomes:
    comp:Accu2009 comp:scheduledBreaks _:a .
    _:a rdf:first comp:MorningCoffee .
    _:a rdf:rest _:b .
    _:b rdf:first comp:Lunch .
    _:b rdf:rest _:c .
    _:c rdf:first comp:AfternoonTea .
    _:c rdf:rest rdf:nil .
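The expansion of a compact list into rdf:first / rdf:rest cells can be sketched as follows. The `list_to_triples` and `new_bnode` helpers and the `_:bN` labels are illustrative assumptions, not part of any RDF library.

```python
from itertools import count

def list_to_triples(items, new_bnode):
    """Return (head node, triples) for an RDF list holding the given items."""
    if not items:
        return "rdf:nil", []
    nodes = [new_bnode() for _ in items]
    triples = []
    for i, (node, item) in enumerate(zip(nodes, items)):
        # The last cell points at rdf:nil; every other cell at the next one.
        rest = nodes[i + 1] if i + 1 < len(nodes) else "rdf:nil"
        triples.append((node, "rdf:first", item))
        triples.append((node, "rdf:rest", rest))
    return nodes[0], triples

_counter = count()

def new_bnode():
    """Mint a fresh blank node label (hypothetical naming scheme)."""
    return "_:b%d" % next(_counter)

head, cells = list_to_triples(
    ["comp:MorningCoffee", "comp:Lunch", "comp:AfternoonTea"], new_bnode
)
# The list itself is attached with one more triple:
attach = ("comp:Accu2009", "comp:scheduledBreaks", head)
```

Three items cost six triples plus the attaching one, which is why the compact N3 form is preferred for reading and writing.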
Explicit Reification
• Used for making statements about statements: “Giovanni told me ACCU 2010 will be in Hawaii”
    :r rdf:subject comp:Accu2010 ;
        rdf:predicate comp:locatedAt ;
        rdf:object geo:Hawaii .
    comp:Giovanni :hasSaid :r .
• By asserting these reification triples, we have not asserted the triple itself.
More RDF • rdf:Property – the class of RDF properties • rdf:Statement – the class of RDF statements • rdf:resource – used in RDF/XML • rdf:about – used in RDF/XML • However, almost all uses of RDF also use RDFS (even the RDF resource itself)
SPARQL • W3C standard query language • Based on triple patterns • Variables denoted by prefixing with ‘?’ • Graph patterns are lists of triple patterns enclosed in curly braces {} • Responses in tabular format: SELECT • Responses in graph format: CONSTRUCT
SELECT
    SELECT ?grandfather ?granddaughter
    WHERE {
        ?grandfather :hasChild ?parent .
        ?parent :hasDaughter ?granddaughter .
    }
UNION
    SELECT ?grandfather ?granddaughter
    WHERE {
        { ?grandfather :hasDaughter ?mother .
          ?mother :hasDaughter ?granddaughter . }
        UNION
        { ?grandfather :hasSon ?father .
          ?father :hasDaughter ?granddaughter . }
    }
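To make triple-pattern matching concrete, here is a toy matcher in Python. It is a sketch of the idea only, not a SPARQL engine: it handles a list of triple patterns joined on shared variables, and the family data (`:Albert`, `:Brenda`, `:Carol`, `:Dave`) is hypothetical.

```python
def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep any value it was already bound to.
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding

def select(variables, patterns, triples):
    """Return sorted rows of variable values satisfying every pattern."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for t in triples:
                b2 = match(pattern, t, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    return sorted({tuple(b[v] for v in variables) for b in bindings})

triples = {
    (":Albert", ":hasChild", ":Brenda"),
    (":Brenda", ":hasDaughter", ":Carol"),
    (":Brenda", ":hasSon", ":Dave"),
}
rows = select(
    ["?grandfather", "?granddaughter"],
    [("?grandfather", ":hasChild", "?parent"),
     ("?parent", ":hasDaughter", "?granddaughter")],
    triples,
)
```

The shared `?parent` variable is what joins the two patterns; a UNION would simply concatenate the rows from two such pattern lists.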
RDFS • Resource Description Framework Schema • Expressed in RDF • Extends RDF by introducing a set of distinguished resources • RDF creates a graph structure, while RDFS models sets • RDFS expresses ‘meaning’ through the mechanism of inference.
Inference • Each RDFS construct is defined by the inferences that can be made when it is used • Inferencing is done by some part of the system and results in inferred triples • When this happens and what happens to the triples is not specified • Typically asserted and inferred triples are indistinguishable • Inferred triples are also used to make further inferences
Inference (2) • Inferencing is the ‘glue’ that is used to join different schemas • The RDFS (and OWL) patterns that follow enable us to specify how schemas should be joined (and the data merged) • The resulting data can then be queried as if it were a single unified triple set
Inferencing Assumptions • Non Unique Naming – a single, real-world entity may have multiple URIs assigned to it. Inferencing engines cannot assume that different URIs refer to different entities. • Open World Assumption – inferencing engines cannot assume they have seen all relevant assertions. More can become available at any time.
Type Propagation
• We have already seen:
    comp:Accu2009 rdf:type comp:ComputerConference .
• In the comp namespace we would find:
    comp:ComputerConference rdf:type rdfs:Class .
    comp:ComputerConference rdfs:subClassOf comp:Conference .
• Hence, we can infer:
    comp:Accu2009 rdf:type comp:Conference .
Relationship Propagation
• We have already seen:
    comp:AccuConference comp:hasKeynote comp:BobMartin .
• In the comp namespace we would find:
    comp:hasKeynote rdf:type rdf:Property .
    comp:hasSpeaker rdf:type rdf:Property .
    comp:hasKeynote rdfs:subPropertyOf comp:hasSpeaker .
• Hence, we can infer:
    comp:AccuConference comp:hasSpeaker comp:BobMartin .
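Type and relationship propagation can both be sketched as rules applied repeatedly until nothing new appears. The Python below is a naive fixpoint loop written for illustration, assuming triples are plain string tuples; it is not how a production reasoner is built.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"

def rdfs_closure(asserted):
    """Apply the subClassOf and subPropertyOf rules until no new triples appear."""
    triples = set(asserted)
    while True:
        inferred = set()
        for s, p, o in triples:
            for s2, p2, o2 in triples:
                # x rdf:type C, C rdfs:subClassOf D  =>  x rdf:type D
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    inferred.add((s, TYPE, o2))
                # x P y, P rdfs:subPropertyOf Q  =>  x Q y
                if p2 == SUBPROP and p == s2:
                    inferred.add((s, o2, o))
        if inferred <= triples:
            return triples
        triples |= inferred

asserted = {
    ("comp:Accu2009", TYPE, "comp:ComputerConference"),
    ("comp:ComputerConference", SUBCLASS, "comp:Conference"),
    ("comp:AccuConference", "comp:hasKeynote", "comp:BobMartin"),
    ("comp:hasKeynote", SUBPROP, "comp:hasSpeaker"),
}
closure = rdfs_closure(asserted)
```

Asserted and inferred triples end up in the same set, and inferred triples feed later rounds, matching the behaviour described on the Inference slide.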
Typing a Property
• Given a triple, S P O, we can describe the usage of the predicate:
    comp:hasSpeaker rdfs:domain comp:Conference .
    comp:hasSpeaker rdfs:range comp:Person .
• There are no invalid assertions in RDFS, so:
    comp:BobMartin comp:hasSpeaker comp:AccuConference .
• Would allow us to infer:
    comp:BobMartin rdf:type comp:Conference .
    comp:AccuConference rdf:type comp:Person .
Unexpected Interaction
• Recall:
    comp:ComputerConference rdf:type rdfs:Class .
    comp:ComputerConference rdfs:subClassOf comp:Conference .
• We now add:
    comp:geekRatio rdf:type rdf:Property .
    comp:geekRatio rdfs:domain comp:ComputerConference .
• Now whenever we see:
    A comp:geekRatio B
  we can infer:
    A rdf:type comp:Conference
Unexpected Interaction (2) • rdfs:domain is not the same as declaring a property on a class in OO modelling • In RDFS, properties are defined independently of classes • RDFS relations (domain, range) tell us what inferences can be made from asserted triples
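The geekRatio interaction can be reproduced with a small rule loop. This is an illustrative sketch assuming plain string tuples for triples; `A` and `B` are the placeholder individuals from the slide.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

def infer(asserted):
    """Apply the domain, range and subClassOf rules to a fixpoint."""
    triples = set(asserted)
    while True:
        new = set()
        for s, p, o in triples:
            for s2, p2, o2 in triples:
                # s P o, P rdfs:domain C  =>  s rdf:type C
                if s2 == p and p2 == DOMAIN:
                    new.add((s, TYPE, o2))
                # s P o, P rdfs:range C  =>  o rdf:type C
                if s2 == p and p2 == RANGE:
                    new.add((o, TYPE, o2))
                # s rdf:type C, C rdfs:subClassOf D  =>  s rdf:type D
                if p == TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, TYPE, o2))
        if new <= triples:
            return triples
        triples |= new

asserted = {
    ("comp:ComputerConference", SUBCLASS, "comp:Conference"),
    ("comp:geekRatio", DOMAIN, "comp:ComputerConference"),
    ("A", "comp:geekRatio", "B"),
}
closure = infer(asserted)
```

Nothing is rejected: using comp:geekRatio on any subject simply makes that subject a comp:ComputerConference, and therefore a comp:Conference.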
Some Simple Patterns • RDFS has a limited number of simple inference rules • These can combine in subtle ways to provide useful inferences • The following patterns simulate some aspects of modelling set manipulations
Set Intersection
• Given:
    comp:Programmer rdf:type rdfs:Class .
    comp:ContractStaff rdf:type rdfs:Class .
• We can model this in one direction:
    comp:ContractProgrammer rdfs:subClassOf comp:Programmer .
    comp:ContractProgrammer rdfs:subClassOf comp:ContractStaff .
• Now by asserting that:
    comp:VerityStobb rdf:type comp:ContractProgrammer .
• We can infer that:
    comp:VerityStobb rdf:type comp:Programmer .
    comp:VerityStobb rdf:type comp:ContractStaff .
Set Union
• Given:
    comp:Programmer rdf:type rdfs:Class .
    comp:Tester rdf:type rdfs:Class .
• We can model this in one direction:
    comp:Programmer rdfs:subClassOf comp:IT_Staff .
    comp:Tester rdfs:subClassOf comp:IT_Staff .
• Now by asserting either:
    comp:VerityStobb rdf:type comp:Programmer .
    comp:VerityStobb rdf:type comp:Tester .
• We can infer that:
    comp:VerityStobb rdf:type comp:IT_Staff .
Properties
• Similar constructs can be used to approximate intersection and union of properties
• Property Transfer can be used to join data from different schemas. Given:
    mySchema:P1 rdfs:subPropertyOf yourSchema:P2 .
  Then:
    mySchema:S mySchema:P1 mySchema:O .
  Entails:
    mySchema:S yourSchema:P2 mySchema:O .
Non-modeling RDFS • rdfs:label – provides a human readable label (for external display) • rdfs:comment – inline human readable documentation • rdfs:seeAlso – a URI for additional documentary resources • rdfs:isDefinedBy – a URI for the resource that defines the subject
RDFS-Plus • RDFS extended with a subset of OWL (specified by Allemang/Hendler) • Inferencing is expensive • RDFS is not quite expressive enough • RDFS-Plus trades expressivity for performance and is useful in many real world situations
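Inverse Properties
• owl:inverseOf
  Given:
    A P B .
    P rdf:type rdf:Property .
    Q rdf:type rdf:Property .
    P owl:inverseOf Q .
  Infer:
    B Q A .
• But, given:
    A Q B .
  this rule alone does not let us infer:
    B P A .
  That inference needs owl:inverseOf itself to be treated as symmetric, as it is in OWL.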
Property Properties
• owl:SymmetricProperty
  Given:
    A P B .
    P rdf:type owl:SymmetricProperty .
  Infer:
    B P A .
• owl:TransitiveProperty
  Given:
    A P B .
    B P C .
    P rdf:type owl:TransitiveProperty .
  Infer:
    A P C .
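The symmetric and transitive rules can be sketched with the same fixpoint approach. This is illustrative Python over string tuples; the property names `P` and `S` and the individuals are placeholders in the style of the slide.

```python
TYPE = "rdf:type"
SYMMETRIC = "owl:SymmetricProperty"
TRANSITIVE = "owl:TransitiveProperty"

def property_closure(asserted):
    """Apply the symmetric and transitive property rules to a fixpoint."""
    triples = set(asserted)
    while True:
        new = set()
        for s, p, o in triples:
            # s P o, P symmetric  =>  o P s
            if (p, TYPE, SYMMETRIC) in triples:
                new.add((o, p, s))
            # s P o, o P x, P transitive  =>  s P x
            if (p, TYPE, TRANSITIVE) in triples:
                for s2, p2, o2 in triples:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= triples:
            return triples
        triples |= new

facts = {
    ("A", "P", "B"), ("B", "P", "C"), ("P", TYPE, TRANSITIVE),
    ("X", "S", "Y"), ("S", TYPE, SYMMETRIC),
}
closed = property_closure(facts)
```

Transitivity alone does not give the reverse direction: `C P A` is not inferred unless `P` is also declared symmetric.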
Sameness of Types
• owl:equivalentClass
  Given:
    A owl:equivalentClass B .
    x rdf:type A .
  Infer:
    x rdf:type B .
• owl:equivalentProperty
  Given:
    P owl:equivalentProperty Q .
    A P B .
  Infer:
    A Q B .
Sameness of Individuals
• owl:sameAs
  Given:
    A owl:sameAs B .
    A P C .
    D P A .
  Infer:
    B P C .
    D P B .
• This identifies equivalent ‘individuals’, while owl:equivalentClass identifies equivalent types.
Sameness of Individuals (2)
• owl:FunctionalProperty
  Given:
    P rdf:type owl:FunctionalProperty .
    A P B .
    A P C .
  Infer:
    B owl:sameAs C .
• This is an important class, since it allows sameness to be inferred
Sameness of Individuals (3)
• owl:InverseFunctionalProperty
  Given:
    P rdf:type owl:InverseFunctionalProperty .
    A P B .
    C P B .
  Infer:
    A owl:sameAs C .
• This kind of property can be thought of as analogous to a primary key in relational databases.
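Both sameness rules can be sketched in one pass over the triples. The Python below is an illustration only: it reports pairs that must be the same individual and does not go on to merge them or propagate owl:sameAs further. The property names `F` and `P` are placeholders.

```python
TYPE = "rdf:type"
FUNCTIONAL = "owl:FunctionalProperty"
INV_FUNCTIONAL = "owl:InverseFunctionalProperty"

def infer_same_as(triples):
    """Return the set of unordered pairs inferred to be owl:sameAs."""
    same = set()
    for s, p, o in triples:
        for s2, p2, o2 in triples:
            if p != p2:
                continue
            # Functional: one subject, two values  =>  the values are the same
            if (p, TYPE, FUNCTIONAL) in triples and s == s2 and o != o2:
                same.add(frozenset((o, o2)))
            # Inverse functional: two subjects, one value  =>  the subjects are the same
            if (p, TYPE, INV_FUNCTIONAL) in triples and o == o2 and s != s2:
                same.add(frozenset((s, s2)))
    return same

facts = {
    ("F", TYPE, FUNCTIONAL),
    ("X", "F", "Y"), ("X", "F", "Z"),
    ("P", TYPE, INV_FUNCTIONAL),
    ("A", "P", "B"), ("C", "P", "B"),
}
pairs = infer_same_as(facts)
```

Note that the inverse functional rule equates the two subjects (`A` and `C`), in the same way that two rows sharing a key value must be the same record.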
Property Classification • owl:DatatypeProperty – for properties whose object is a literal value, e.g. comp:startDate rdf:type owl:DatatypeProperty • owl:ObjectProperty – for properties whose object is a resource, e.g. comp:hasSpeaker rdf:type owl:ObjectProperty • Used to assist tool support, not semantics
Utility of RDFS-Plus • Can model a wide variety of relationships • Can infer sameness of individuals and types • Is relatively cheap to implement • However, further OWL constructs allow us greater possibilities: - Classification by restriction - Full set manipulation - Cardinality
Restriction
    comp:OxfordConferences owl:equivalentClass
        [ a owl:Restriction ;
          owl:onProperty comp:locatedAt ;
          owl:someValuesFrom comp:OxfordVenues ] .
• owl:Restriction rdfs:subClassOf owl:Class .
• Defined by a description of its members in terms of existing properties and classes.
On Property
    comp:OxfordConferences owl:equivalentClass
        [ a owl:Restriction ;
          owl:onProperty comp:locatedAt ;
          owl:someValuesFrom comp:OxfordVenues ] .
• Specifies which property is used to define the restriction class
Some Values From
    comp:OxfordConferences owl:equivalentClass
        [ a owl:Restriction ;
          owl:onProperty comp:locatedAt ;
          owl:someValuesFrom comp:OxfordVenues ] .
• At least one value of the property comes from the specified class
All Values From
    comp:OxfordConferences owl:equivalentClass
        [ a owl:Restriction ;
          owl:onProperty comp:locatedAt ;
          owl:allValuesFrom comp:OxfordVenues ] .
• All values of the property come from the specified class
• This includes the empty set: an individual with no values for the property also satisfies the restriction
Has Value
    comp:BarceloOxfordConferences owl:equivalentClass
        [ a owl:Restriction ;
          owl:onProperty comp:locatedAt ;
          owl:hasValue geo:BarceloOxford ] .
• The value of the property is as specified
• Special case of owl:someValuesFrom
    comp:Programmer a owl:Class .
    comp:DevelopmentTool a owl:Class .
    comp:developsWith a rdf:Property ;
        rdfs:domain comp:Programmer ;
        rdfs:range comp:DevelopmentTool .
    comp:OldSchoolTool rdfs:subClassOf comp:DevelopmentTool .
    comp:OldSchoolCoder rdfs:subClassOf
        [ a owl:Restriction ;
          owl:onProperty comp:developsWith ;
          owl:allValuesFrom comp:OldSchoolTool ] .
Now if we assert:
    comp:AlanLenton a comp:OldSchoolCoder ;
        comp:developsWith comp:Vim .
What inference can we make?
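One way to work through the question is to apply the owl:allValuesFrom rule by hand. The Python below is a sketch under simplifying assumptions: the restriction is represented as a dictionary entry rather than as triples, and only this one rule is applied.

```python
TYPE = "rdf:type"

# Simplified stand-in for the restriction: every member of the key class
# has all its values for the given property in the given value class.
ALL_VALUES_FROM = {
    "comp:OldSchoolCoder": ("comp:developsWith", "comp:OldSchoolTool"),
}

facts = {
    ("comp:AlanLenton", TYPE, "comp:OldSchoolCoder"),
    ("comp:AlanLenton", "comp:developsWith", "comp:Vim"),
}

def apply_all_values_from(facts, restrictions):
    """x a C, C restricted to allValuesFrom V on P, x P y  =>  y a V."""
    inferred = set()
    for s, p, o in facts:
        if p != TYPE or o not in restrictions:
            continue
        prop, value_class = restrictions[o]
        for s2, p2, o2 in facts:
            if s2 == s and p2 == prop:
                inferred.add((o2, TYPE, value_class))
    return inferred

inferred = apply_all_values_from(facts, ALL_VALUES_FROM)
```

Under these assumptions the rule types the tool rather than rejecting the coder: comp:Vim is inferred to be a comp:OldSchoolTool. The rdfs:domain, rdfs:range and rdfs:subClassOf statements would add further type triples that this sketch leaves out.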
Protégé • Open source ontology editing tool • Version 3.4 supports OWL 1.0, SPARQL and integrates with reasoners • Uses slightly archaic terminology for restriction description: - necessary (or partial definition) - necessary and sufficient (or complete definition)
Set Operations
• These use the same syntax we saw for list constructs:
    C a owl:Class ; owl:unionOf ( A B ) .
    C a owl:Class ; owl:intersectionOf ( A B C ) .
Assumptions Revisited • With set operations we would also like to express cardinality constraints • Open World and Non Unique Name Assumptions make this impossible • We can expressly turn off these assumptions where necessary
Explicit Set Membership
• This ‘Closes the World’:
    ed:OxbridgeUniversities a owl:Class ;
        owl:oneOf ( ed:OxfordUni ed:CambridgeUni ) .
• But we also need to limit the Non Unique Name Assumption:
    ed:OxfordUni owl:differentFrom ed:CambridgeUni .
And for larger sets …
    [ a owl:AllDifferent ;
      owl:distinctMembers ( :BlackBall :PinkBall :BlueBall :GreenBall
                            :BrownBall :YellowBall :RedBall ) ] .
Cardinality Restrictions
•   [ a owl:Restriction ;
      owl:onProperty cards:bridgePlayer ;
      owl:cardinality 4 ]
•   [ a owl:Restriction ;
      owl:onProperty cards:monopolyPlayer ;
      owl:minCardinality 2 ;
      owl:maxCardinality 6 ]
Set Complement
• A owl:complementOf B
• Very dangerous, because A now contains everything (in the universe) not in B
• Usually used with intersections:
    comp:NonGroovyProgrammers owl:intersectionOf
        ( [ a owl:Class ; owl:complementOf comp:GroovyProgrammers ]
          comp:Programmers ) .
Disjoint Sets • A owl:disjointWith B • No member of A can be a member of B • Used to infer difference • There is no AllDisjoint construct for classes that is analogous to the owl:AllDifferent construct for individuals
Contradictions
• In RDFS there could be no contradictions
• OWL constructs allow us to make contradictory statements:
    :WayneCounty a :Man .
    :JayneCounty a :Woman .
    :WayneCounty owl:sameAs :JayneCounty .
    :Man owl:disjointWith :Woman .
• OWL can tell us there is a contradiction, but cannot tell us which assertion is ‘wrong’
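Detecting this particular kind of contradiction can be sketched as a check that no individual, after merging across owl:sameAs, belongs to two disjoint classes. The Python below is a deliberately narrow illustration: it only looks at directly asserted owl:sameAs pairs and asserted types, with no inference beforehand.

```python
TYPE = "rdf:type"
SAME_AS = "owl:sameAs"
DISJOINT = "owl:disjointWith"

def contradictions(triples):
    """Report (individual, individual, disjoint classes) clashes."""
    types = {}
    for s, p, o in triples:
        if p == TYPE:
            types.setdefault(s, set()).add(o)
    disjoint = {frozenset((s, o)) for s, p, o in triples if p == DISJOINT}
    found = []
    for s, p, o in sorted(triples):
        if p != SAME_AS:
            continue
        # Individuals declared the same share all of each other's types.
        merged = types.get(s, set()) | types.get(o, set())
        for pair in disjoint:
            if pair <= merged:
                found.append((s, o, tuple(sorted(pair))))
    return found

facts = {
    (":WayneCounty", TYPE, ":Man"),
    (":JayneCounty", TYPE, ":Woman"),
    (":WayneCounty", SAME_AS, ":JayneCounty"),
    (":Man", DISJOINT, ":Woman"),
}
clashes = contradictions(facts)
```

The check reports that a clash exists and which statements take part in it; as the slide says, it has no way of deciding which of them to retract.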
Unsatisfiable Classes • Since we can make contradictory statements, it follows that we can define classes that can’t have any members • Unsatisfiability can be propagated through subClassOf, someValuesFrom, intersection, domain and range constructs
Classes and Individuals • Reasoning about classes and individuals is implemented identically • Reasoning about individuals can be thought of as processing data and generating output • Reasoning about classes can be thought of as compilation of the model • Reasoning about classes can be done in the absence of any individuals
Class or Individual
• Would like to model separately
• Is a Bird an individual in the class of Animals, or a class containing individuals?
• Use the Class-Individual Mirror Pattern:
    :Bird a :Animal .
    :Birds owl:equivalentClass
        [ a owl:Restriction ;
          owl:onProperty :comprises ;
          owl:hasValue :Bird ] .
Antipatterns • Allemang and Hendler identify 5 antipatterns: - Rampant Classism - Exclusivity - Objectification - Managing Identifiers for Classes - Creeping Conceptualization
Objectification
• Model a person with parents:
    :Person a owl:Class .
    :hasParent rdfs:domain :Person .
    :hasParent rdfs:range :Person .
    [ a owl:Restriction ;
      owl:onProperty :hasParent ;
      owl:cardinality 2 ] .
• What happens if we assert:
    :Willem :hasParent :Beatrix .
Inferences • It’s not a contradiction that we haven’t asserted that Willem and Beatrix are Persons – it will be inferred • It’s not a contradiction that there is only one hasParent assertion – Open World • It wouldn’t be a contradiction if there were more than 2 hasParent assertions – Non Unique Naming
Reuse Revisited • owl:Ontology – specifies URI of ontology • owl:imports – includes specified ontology • owl:versionInfo – human/tool readable • owl:priorVersion • owl:backwardCompatibleWith • owl:incompatibleWith • owl:DeprecatedClass • owl:DeprecatedProperty
Modeling Philosophy • Provable – a model should be decidable (from Description Logic). This is important in formal systems where every possible inference must be guaranteed. • Executable – inferences must be correct, but not necessarily complete. Like most programming languages, these models are not decidable.
OWL Dialects • OWL Full – unconstrained use of OWL constructs. Not decidable. • OWL DL – constrained use of OWL constructs. Decidable (in finite time) • OWL Lite – a cutdown version of OWL intended to ease implementation • These are defined by the standard, but there are numerous other proprietary subsets.
owl:Class vs. rdfs:Class • owl:Class is defined as a subclass of rdfs:Class • In OWL Lite and OWL DL, owl:Class must be used for all class descriptions • Not all RDFS classes are legal OWL Lite or OWL DL classes • In OWL Full owl:Class and rdfs:Class are equivalent
Usability • There are plenty of Semantic Web systems out there, but • Inferencing can be very slow • Especially in the presence of data updates • Received wisdom is that for large (i.e. useful) datasets, inferencing should be done over the T-Boxes only
OWL 2.0 and beyond • This presentation has been based on OWL 1.0 • OWL 2.0 is in development and there are already tools that support the emerging standard • SPARQL-DL is being proposed as a successor to SPARQL
Conclusion • This technology is still developing • There is lots of academic interest • Tooling is becoming usable • Business is beginning to invest
References • Semantic Web for the Working Ontologist – Allemang/Hendler – ISBN 978-0-12-373556-0 • The Semantic Web – Tim Berners-Lee, Jim Hendler, Ora Lassila – Scientific American, May 2001: http://www.sciam.com/article.cfm?id=the-semantic-web • www.w3.org • Protégé OWL Tutorial – Horridge 2004: www.co-ode.org/resources/tutorials/ProtegeOWLTutorial.pdf