Granularity in Library Linked Open Data Gordon Dunsire
Keynote presentation to Code4Lib 2013, 12-14 Feb 2013, Chicago, USA
Overview
Fractals
  • Self-similar at all levels of granularity
  • Cannot determine level: all levels are equal!
Multi-faceted granularity
  • What is described by a bibliographic record?
  • Or a single statement?
  • What is the level of description?
  • How complete is it?
  • How detailed is the schema used?
  • How dumb?
  • Semantic constraints?
  • Unconstrained?
  • AAA! OWA! Rumsfeld and the white light!
Resource Description Framework – Linked data
  Triple: “This resource” (Subject) “has intended audience” (Predicate) “Juvenile” (Object)
  Each part of the triple has granularity?
  “Coarse-grained systems consist of fewer, larger components than fine-grained systems” [Wikipedia]
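The triple on this slide can be written down as a three-part record. This is only a minimal sketch: the `Triple` type and the `ex:` identifiers are hypothetical names chosen for illustration, not part of any schema named in the slides.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers standing in for the slide's example statement:
# "This resource" / "has intended audience" / "Juvenile".
statement = Triple(
    subject="ex:thisResource",
    predicate="ex:hasIntendedAudience",
    obj="Juvenile",
)

print(statement.subject, statement.predicate, statement.obj)
```

The question the slide raises applies to each field separately: the subject, the predicate, and the object can each be coarser or finer.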
Subject: what is the statement about?
  Granularity scale, coarser to finer: Super-Aggregate, Focus, Component, Sub-Component
  Example subjects shown on the diagram: Consortium collection; Library collection; Digital collection; Journals; Subjects; Access; Journal title; Journal index; Issue; Festschrift; Article; Resource; Work; Section; Graphics; Paragraph; Markup; Word; URI; RDF/XML; Node; Page; RDF map
Predicate: what is the aspect described?
  Granularity scale, coarser to finer: Super-Aggregate, Aggregate, Focus, Component, Sub-Component
  Example predicates shown on the diagram: Membership category; Access to resource; Access to content; Suitability rating; Audience and usage; Audience of audio-visual material
Possible Audience map (partial)
  Prefixes:
  • unc: unconstrained version
  • isbd: International Standard Bibliographic Description
  • dct: Dublin Core terms
  • rda: Resource Description and Access
  • m21: marc21rdf.info
  • schema: Schema.org
  • frbrer: Functional Requirements for Bibliographic Records, entity-relationship model
  Properties on the map, linked by rdfs:subPropertyOf:
  • unc: “has note on use or audience”
  • unc: “Intended audience”
  • isbd: “has note on use or audience”
  • dct: “audience”
  • schema: “audience”
  • m21: “Target audience”
  • m21: “Target audience of …”
  • rda: “Intended audience”
  • frbrer: “has intended audience”
What is the aspect described?
  Granularity scale, coarser to finer: Super-Aggregate, Focus, Component, Sub-Component
  Example aspects shown on the diagram: Resource record; Manifestation record; Title and s.o.r.; Title statement; Title of manifestation; Title word; First word of title
Possible Title semantic map (partial)
  Link labels: sP: rdfs:subPropertyOf; d: rdfs:domain; r: rdfs:range; eP
  Properties on the map:
  • dc: “Title”
  • dct: “Title”
  • rdaopen: “Title”
  • rdaopen: “Title proper”
  • isbd: “has title”
  • isbd: “has title proper”
  • rdagrp1: “Title (Manifestation)”
  • rdagrp1: “Title proper (Manifestation)”
  Classes on the map (domains and ranges):
  • rdfs: “Literal”
  • isbd: “Resource”
  • rdafrbr: “Manifestation”
Semantic reasoning: the sub-property ladder
  Semantic rule:
  If property1 is a sub-property of property2;
  Then data triple: Resource property1 “string”
  Implies data triple: Resource property2 “string”
  Diagram: a ladder of rdfs:subPropertyOf links running from finer to coarser (“dumb-up”), with labels isbd: “has title proper”, dct: “has title”, and dct: title. By machine entailment, the finer triple isbd: “Resource” isbd: “has title proper” “Physics” yields the same resource and the same string “Physics” under each coarser property.
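The rule above can be sketched as a small forward-chaining loop. This is an illustration under stated assumptions, not a real reasoner: the property identifiers and the exact shape of the ladder (`isbd:hasTitleProper` under `isbd:hasTitle` under `dct:title`) are hypothetical stand-ins for the labels on the slide, and each property is assumed to have at most one parent.

```python
# Hypothetical sub-property map: finer property -> coarser property.
SUB_PROPERTY_OF = {
    "isbd:hasTitleProper": "isbd:hasTitle",
    "isbd:hasTitle": "dct:title",
}


def entail(triples, sub_property_of):
    """Return the input triples plus every triple implied by climbing the ladder."""
    inferred = set(triples)
    frontier = list(triples)
    while frontier:
        subject, prop, value = frontier.pop()
        parent = sub_property_of.get(prop)
        if parent is None:
            continue  # top of the ladder: nothing coarser to infer
        implied = (subject, parent, value)
        if implied not in inferred:  # also guards against cyclic maps
            inferred.add(implied)
            frontier.append(implied)
    return inferred


data = {("ex:1", "isbd:hasTitleProper", "Physics")}
closure = entail(data, SUB_PROPERTY_OF)

for triple in sorted(closure):
    print(triple)
```

Entailment only runs from finer to coarser: a triple stated with the coarse property says nothing about which finer property applies.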
Data triples from multiple schema
  ex:1  frbrer: “has intended audience”  “Primary school”
  ex:2  isbd: “has note on use or audience”  “For ages 5-9”
  ex:3  rda: “Intended audience (Work)”  “For children aged 7-”
  ex:4  m21: “Target audience”  m21terms:commonaud#j
  m21terms:commonaud#j  skos:prefLabel  “Juvenile”
Data triples entailed from sub-property map
  ex:1  unc: “has note on use or audience”  “Primary school”
  ex:2  unc: “has note on use or audience”  “For ages 5-9”
  ex:3  unc: “has note on use or audience”  “For children aged 7-”
  ex:4  unc: “has note on use or audience”  “Juvenile”
Data triples entailed from property domains
  ex:1  “is a”  frbrer: “Work”
  ex:2  “is a”  isbd: “Resource”
  ex:3  “is a”  rda: “Work”
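Domain entailment can be sketched in the same spirit: if a property declares a domain, any subject that uses the property is inferred to be an instance of that class. The identifiers below are hypothetical CURIEs standing in for the slide labels, and the domain map covers only the three properties whose classes appear on this slide.

```python
RDF_TYPE = "rdf:type"  # the "is a" link on the slide

# Hypothetical domain declarations: property -> class of its subjects.
DOMAIN = {
    "frbrer:hasIntendedAudience": "frbrer:Work",
    "isbd:hasNoteOnUseOrAudience": "isbd:Resource",
    "rda:intendedAudienceWork": "rda:Work",
}

# The example data triples, with hypothetical property identifiers.
data = {
    ("ex:1", "frbrer:hasIntendedAudience", "Primary school"),
    ("ex:2", "isbd:hasNoteOnUseOrAudience", "For ages 5-9"),
    ("ex:3", "rda:intendedAudienceWork", "For children aged 7-"),
    ("ex:4", "m21:targetAudience", "m21terms:commonaud#j"),
}


def entail_types(triples, domain):
    """Infer a type triple for every subject whose property declares a domain."""
    return {
        (subject, RDF_TYPE, domain[prop])
        for subject, prop, _ in triples
        if prop in domain
    }


typed = entail_types(data, DOMAIN)

for triple in sorted(typed):
    print(triple)
```

In this sketch ex:4 gains no type, because no domain is declared for its property in the map. That is a consequence of the assumed map, not a claim about the MARC 21 schema.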
What is the aspect described?
  Granularity scale, coarser to finer: Super-Aggregate, Focus, Component, Sub-Component
  Example aspects shown on the diagram: Creator; Author; Screenwriter; Animation screenwriter; Children’s cartoon screenwriter
  Link labels: s: rdfs:subPropertyOf; d: rdfs:domain; r: rdfs:range; ?: link in question
  Properties on the map:
  • dc: “Contributor”
  • dc: “Creator”
  • dct: “Creator”
  • marcrel: “Author”
  • marcrel: “Author of screenplay, etc.”
  • rdaroles: “Creator”
  • rdaroles: “Author (Work)”
  • rdaroles: “Screenwriter (Work)”
  Classes and concepts on the map:
  • dct: “Agent”
  • rda: “Work”
  • [rda: “Agent”]
  • lcsh: “Screenwriters”
Machine-generated granularity
  Full-text indexing: down to word level
  A very large multilingual ontology with 5.5 millions of concepts
  • A wide-coverage "encyclopedic dictionary"
  • Obtained from the automatic integration of WordNet and Wikipedia
  • Enriched with automatic translations of its concepts
  • Connected to the Linguistic Linked Open Data cloud!
User-generated granularity
  • “OK for my kids (7 and 9)”
  • “Too childish for me (age 14)”
  • “Ideal for the child of ambitious parents”
  • “This sucks – for kids only”
  • “Great! Has cool stuff”
KISS
  Keep it simple, stupid
  Keep it simple and stupid?
  • The data model is very simple: triples!
  • The (meta)data content is complex
  • Resource discovery is complex
  The Mandelbrot Set: “an example of a complex structure arising from the application of simple rules” - Wikipedia
AAA
  Anyone can say anything about any thing
  Someone will say something about every thing
  In every conceivable way
  Linguistically
OWA
  Open World Assumption: the absence of a statement is not a statement of non-existence
  “There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know.” - Donald Rumsfeld
  Will all the gaps get filled?
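The difference between a closed-world and an open-world reading of missing data can be sketched as two lookups over the same set of triples. Both function names and the `ex:` / `unc:` identifiers are hypothetical, and returning `None` for "unknown" is just one convenient way to model the idea.

```python
from typing import Optional


def holds_closed_world(triples, statement) -> bool:
    """Closed world: anything not stated is taken to be false."""
    return statement in triples


def holds_open_world(triples, statement) -> Optional[bool]:
    """Open world: anything not stated is unknown (None), not false."""
    return True if statement in triples else None


known = {("ex:1", "unc:hasNoteOnUseOrAudience", "Primary school")}

stated = ("ex:1", "unc:hasNoteOnUseOrAudience", "Primary school")
missing = ("ex:2", "unc:hasNoteOnUseOrAudience", "Primary school")

print(holds_closed_world(known, missing))  # False: treated as non-existent
print(holds_open_world(known, missing))    # None: simply not known yet
```

Under the open-world reading, a gap in the data stays a gap until someone makes a statement that fills it.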