Accessing Cultural Heritage using Semantic Web Techniques
Antoine ISAAC
VU Amsterdam - KB
Digital Access to Cultural Heritage Master
March 20th, 2008
Accessing CH using Semantic Web techniques
Background
• CATCH (NWO): Continuous Access To Cultural Heritage
  • Computer science research projects
  • Applied to Cultural Heritage
• STITCH: SemanTic Interoperability To access Cultural Heritage
  • Exchanging and integrating metadata
Agenda
• Cultural Heritage interoperability problems
• Why Semantic Web techniques can be relevant
• Porting CH vocabularies to the Semantic Web
• Vocabulary alignment
• Demo?
The Interoperability Problem in Cultural Heritage
• Trend: simultaneous access to different collections
  • The European Library, Memory of the Netherlands
• Problem: how to access different collections seamlessly?
• Traditional solution: using object metadata
  • For instance, subjects coming from controlled vocabularies
• But…
Interoperability Problems
From syntactic to semantic
• Different formats
• Different metadata schemas
• Different conceptual vocabularies
Interoperability Solutions?
From syntactic to semantic
• Different formats
  • “We have a solution!”
  • XML as a standard for data exchange
• Different metadata schemas
  • “Something could be used…”
  • Dublin Core for simple metadata publication & exchange
• Different conceptual vocabularies
Interoperability Solutions?
From syntactic to semantic (continued)
• Different conceptual vocabularies
  • “Do you really want to discuss it now?”
  • No standard vocabulary
    • DDC, UDC, SWD, LCSH, AAT, Iconclass
    • and myriads of others…
  • Not even a common model: classes, terms, concepts…
  • Even worse: there are reasons for this!
KB Illustrated Manuscripts
KB Illustrated Manuscripts: Iconclass
Mandragore
What we have
What we want
CH Interoperability ProblemS
Agenda
• Cultural Heritage interoperability problems
• Why Semantic Web techniques can be relevant
• Porting CH vocabularies to the Semantic Web
• Vocabulary alignment
• Demo?
What is the Semantic Web?
• Pushed by the World Wide Web Consortium: http://www.w3.org/2001/sw/
• “The Semantic Web is a web of data”
• “It is about common formats for integration and combination of data drawn from diverse sources”
SW Problem: The Web for Humans
• A city
• A flag
• The city’s location
Meaning
SW Problem: The Web for Computers?
Where is meaning?
SW Problem: The Web for Computers?
The Semantic Web Approach: A Web of (Meta)data
[Figure: graph with the nodes Article, Document, file1, paragraph3, Amsterdam, City and The_Netherlands, connected by the links subClassOf, type, hasCapital, partOf and defines]
The Semantic Web (1/2)
• Pointing at resources
  • What? Knowledge objects
    • everything that we may want to refer to
    • including documents, persons…
  • How? Uniform Resource Identifiers
    • HTTP URLs: http://www.few.vu.nl/~aisaac/
    • urn:isbn:0-395-36341-1
    • mailto:aisaac@few.vu.nl
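As a small illustration of the three URI forms listed on this slide, the sketch below splits each one with Python's standard `urllib.parse` and reports its scheme. The helper name `uri_scheme` is hypothetical and only serves the example.

```python
from urllib.parse import urlsplit

# The three example URIs from the slide.
uris = [
    "http://www.few.vu.nl/~aisaac/",
    "urn:isbn:0-395-36341-1",
    "mailto:aisaac@few.vu.nl",
]


def uri_scheme(uri: str) -> str:
    """Return the scheme of a URI, i.e. the part before the first colon."""
    return urlsplit(uri).scheme


schemes = [uri_scheme(u) for u in uris]

for uri, scheme in zip(uris, schemes):
    print(f"{scheme:8} {uri}")
```

Whatever the scheme, each string plays the same role: a globally unique name that statements can point at.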
A Web of Resources
[Figure: the resources myVoc1:Article, http://ex.org/files/file1, myVoc2:Amsterdam and http://www.ned.nl/rep321]
The Semantic Web (2/2)
• Pointing at resources: URIs
• Creating structured assertions involving resources
  • What? Typed links between resources
  • How? RDF (Resource Description Framework)
    • Statements: subject-predicate-object
Data in an RDF “Graph”
[Figure: RDF graph over the resources myVoc1:Article, http://ex.org/files/file1, myVoc2:Amsterdam and http://www.ned.nl/rep321, linked by rdf:type and myVoc1:defines]
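A minimal sketch of what "statements subject-predicate-object" amounts to, using plain Python tuples instead of an RDF library. The slide's figure is not reproduced here, so which resource is the subject of each link is an assumption made for illustration; the helper `objects` is hypothetical as well.

```python
# Resource names taken from the slide; the prefixed names (myVoc1:, myVoc2:,
# rdf:) are kept as plain strings rather than expanded to full URIs.
FILE1 = "http://ex.org/files/file1"

# ASSUMPTION: the subjects of these two statements are guessed, since the
# original figure is lost. Each tuple is (subject, predicate, object).
graph = {
    (FILE1, "rdf:type", "myVoc1:Article"),
    (FILE1, "myVoc1:defines", "myVoc2:Amsterdam"),
}


def objects(graph, subject, predicate):
    """Return, sorted, every object linked to `subject` through `predicate`."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


print(objects(graph, FILE1, "rdf:type"))
print(objects(graph, FILE1, "myVoc1:defines"))
```

A set of such triples is already a graph: resources are the nodes, and each statement is a typed, directed link.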
Building on Top of the Web
• Web-based resources allow distribution/sharing of
  • documents
  • description vocabularies
  • (meta)data
[Figure: http://www.geo.org/voc/, http://www.kb.nl/eDepot, http://www.ned.nl/rep321 and the statement (par 3, defines, Amsterdam); different owners & locations]
CH Interoperability ProblemS (reminder)
What do we need then?
• Porting vocabularies to the Semantic Web
What do we need then?
• Aligning vocabularies
Agenda
• Cultural Heritage interoperability problems
• Why Semantic Web techniques can be relevant
• Porting CH vocabularies to the Semantic Web
• Vocabulary alignment
• Demo?
SKOS (Simple Knowledge Organization System)
• Model to represent Knowledge Organization Systems (KOSs) on the SW
  • In a simple way
  • In a standard way
• Comparable to Dublin Core, for conceptual vocabularies
• Still being elaborated by W3C: http://www.w3.org/2004/02/skos/
SKOS (Simple Knowledge Organization System)
• SKOS offers building blocks to represent KOSs in RDF
  • Objects: Concept and ConceptScheme
  • Lexical properties (multilingual)
    • prefLabel
    • altLabel
  • Semantic relations
    • broader, narrower
    • related
  • Notes
    • scopeNote
    • definition
  • …
SKOS: Example
• http://www.iconclass.nl/ rdf:type skos:ConceptScheme
• http://www.iconclass.nl/s_11F rdf:type skos:Concept
• http://www.iconclass.nl/s_11F skos:inScheme http://www.iconclass.nl/
• http://www.iconclass.nl/s_11F skos:prefLabel “the Virgin Mary”@en
• http://www.iconclass.nl/s_11F skos:prefLabel “la Vierge Marie”@fr
• http://www.iconclass.nl/s_11F skos:broader http://www.iconclass.nl/s_11
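The Iconclass example can be written down as data and queried. The sketch below is a plain-Python approximation, not a real SKOS toolkit: language-tagged labels are modelled as `(text, language)` pairs, and the helpers `pref_label` and `broader` are hypothetical names introduced for the example.

```python
ICONCLASS = "http://www.iconclass.nl/"
S_11F = ICONCLASS + "s_11F"
S_11 = ICONCLASS + "s_11"

# (subject, predicate, object) statements from the SKOS example slide.
# Literals with a language tag are represented as (text, language) tuples.
triples = [
    (ICONCLASS, "rdf:type", "skos:ConceptScheme"),
    (S_11F, "rdf:type", "skos:Concept"),
    (S_11F, "skos:inScheme", ICONCLASS),
    (S_11F, "skos:prefLabel", ("the Virgin Mary", "en")),
    (S_11F, "skos:prefLabel", ("la Vierge Marie", "fr")),
    (S_11F, "skos:broader", S_11),
]


def pref_label(triples, concept, lang):
    """Return the preferred label of `concept` in `lang`, or None if absent."""
    for s, p, o in triples:
        if s == concept and p == "skos:prefLabel" and isinstance(o, tuple):
            text, label_lang = o
            if label_lang == lang:
                return text
    return None


def broader(triples, concept):
    """Return the concepts that `concept` declares as broader."""
    return [o for s, p, o in triples if s == concept and p == "skos:broader"]


print(pref_label(triples, S_11F, "fr"))
print(broader(triples, S_11F))
```

Because labels are attached to a concept URI rather than being the concept itself, the same concept can carry one preferred label per language.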
Agenda
• Cultural Heritage interoperability problems
• Why Semantic Web techniques can be relevant
• Porting CH vocabularies to the Semantic Web
• Vocabulary alignment
• Demo
The semantic interoperability problem
• There is no standard vocabulary
• We don’t really want one: different vocabularies serve different expertise domains, traditions, tasks
• Consequence:
  • “klassieke ruïnes” vs. “landschap met ruïnes”
  • “maagd Maria” vs. “Heilige Moeder”
Vocabulary alignment
• Aim: finding semantic correspondences between vocabulary elements
  • “klassieke ruïnes” ≈ “landschap met ruïnes”
  • “maagd Maria” = “Heilige Moeder”
• Doing it (semi-)automatically
  • Vocabularies are big (tens of thousands of concepts)
  • They change
Automatic alignment techniques
• Lexical: labels of entities and textual definitions
• Structural: structure of the vocabularies
• Background knowledge: using a shared conceptual reference to find links
• Extensional: object information (e.g. book indexing)
[Figure labels: Long, brain, tumor, Long tumor]
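To make the lexical technique concrete, here is a minimal sketch that scores two labels by the overlap of their word sets (Jaccard similarity). This is one simple lexical measure chosen for illustration, not necessarily what the STITCH tools use; the function name `token_jaccard` is hypothetical. The labels are the Dutch examples from the earlier slides.

```python
def token_jaccard(label_a: str, label_b: str) -> float:
    """Share of distinct lowercase words that the two labels have in common."""
    tokens_a = set(label_a.lower().split())
    tokens_b = set(label_b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


pairs = [
    ("klassieke ruïnes", "landschap met ruïnes"),
    ("maagd Maria", "Heilige Moeder"),
]

for a, b in pairs:
    print(f"{a!r} vs {b!r}: {token_jaccard(a, b):.2f}")
```

The first pair shares one word out of four, so it gets a weak score; the second pair shares none even though the slides treat the two labels as equivalent. That gap is why lexical matching is usually combined with the other techniques.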
Extensional Statistical Alignment
• Object information (e.g. book indexing)
[Figure: Thesaurus 1 and Thesaurus 2, the concepts “Dutch Literature” and “Dutch”, and a collection of books]
Results
1: 9132.9 (1704 3479 976) Schilderijen schilderkunst
2: 8088.5 (1204 2330 767) Kwaliteitszorg kwaliteitsmanagement
3: 6232.7 (820 1572 543) Personeelsmanagement personeelsbeleid
4: 5392.1 (1399 3271 622) Beeldende kunsten beeldende kunst
5: 5063.1 (4951 1152 613) Nederlands - Nederlandse taalkunde
17: 3421.8 (280 714 243) Diabetes mellitus suikerziekte
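The extensional idea can be sketched as follows: two concepts from different thesauri are likely to correspond when they index largely the same books. The example uses Jaccard overlap of the book sets, which is an assumption made for illustration and not necessarily the statistic behind the scores listed above. The books, the extra concepts "History" and "Poetry", and the assignment of “Dutch Literature” to Thesaurus 1 and “Dutch” to Thesaurus 2 are all hypothetical toy data.

```python
from collections import defaultdict

# Toy collection (hypothetical): book -> (Thesaurus 1 concepts, Thesaurus 2 concepts).
annotations = {
    "book1": ({"Dutch Literature"}, {"Dutch"}),
    "book2": ({"Dutch Literature"}, {"Dutch"}),
    "book3": ({"Dutch Literature"}, {"Poetry"}),
    "book4": ({"History"}, {"Dutch"}),
}


def extents(annotations, index):
    """Map each concept of thesaurus `index` (0 or 1) to the books it indexes."""
    result = defaultdict(set)
    for book, concept_sets in annotations.items():
        for concept in concept_sets[index]:
            result[concept].add(book)
    return result


def jaccard(books_a, books_b):
    """Overlap of two book sets: shared books divided by all books in either."""
    union = books_a | books_b
    return len(books_a & books_b) / len(union) if union else 0.0


ext1 = extents(annotations, 0)
ext2 = extents(annotations, 1)

# Score every cross-thesaurus concept pair, best candidates first.
scores = sorted(
    ((jaccard(b1, b2), c1, c2) for c1, b1 in ext1.items() for c2, b2 in ext2.items()),
    reverse=True,
)
for score, c1, c2 in scores:
    print(f"{score:.2f}  {c1}  ~  {c2}")
```

Unlike lexical matching, this only needs the labels to be used on the same objects, not to look alike, but it depends on having a collection that is indexed with both vocabularies.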
Alignment: no Trivial Solution
• Current techniques are not reliable as the sole source of knowledge
  • A workflow would imply checking/completion by a human
  • A combination of techniques is required
• Alignment is a difficult research problem
Agenda
• Cultural Heritage interoperability problems
• Why Semantic Web techniques can be relevant
• Porting CH vocabularies to the Semantic Web
• Vocabulary alignment
• Demo?
Demo
• KB Illuminated Manuscripts
• BNF Mandragore Manuscripts
• http://galjas.cs.vu.nl:33333/MANDRA-SV-ICEmandra.New.NONE
• amphibians
• Wheat
Message
Semantic Web techniques
• Representation of collections and vocabularies
• Alignment of vocabularies
can help solve Cultural Heritage problems
• Semantic integration
• Publication and access
• [And more: semantic query expansion, clustering…]
Thanks!
Links
• Semantic Web at W3C
  • http://www.w3.org/2001/sw/
• SKOS
  • http://www.w3.org/2004/02/skos/
• Cultural Heritage and Semantic Web projects
  • MuseumFinland, http://www.museosuomi.fi/
  • eCulture, http://e-culture.multimedian.nl/
  • STITCH, http://www.cs.vu.nl/STITCH/
  • CATCH, http://www.nwo.nl/catch