UNIMARC and linked data
Gordon Dunsire and Mirna Willer
Presented at Session 187 (Advancing UNIMARC: alignment and innovation) of the World Library and Information Congress: 77th IFLA General Conference and Assembly, 13-18 August 2011, San Juan, Puerto Rico

Overview
• Background
• Linked data and the Semantic Web
• Methods and issues in representing UNIMARC for the Semantic Web
• Recommendations

Background
• Representation of IFLA standards for use in the Semantic Web
  – Work of the FRBR Namespaces project and IFLA Namespaces Task Group
  – Work of the ISBD/XML Study Group
    • Included a feasibility study of representation of UNIMARC
• Representations allow legacy catalogue records to be published as linked data using RDF
• Branding IFLA standards for authority & trust
  – Semantic Web lets “Anyone say Anything about Any resource”

Linked data and RDF
• Resource Description Framework (RDF)
• Designed for machine-processing of metadata at global scale (Semantic Web)
  – 24/7/365
  – Trillions of operations per second
• Everything must be disambiguated
  – Machines are dumb
• A simple approach helps!
  – Machine-readable identifiers

RDF triple
• Metadata expressed as “atomic” statements
  – A simple, single, irreducible statement
    • The title of this book is “Cataloguing is fun!”
• Constructed in 3 parts
  – “Triple”
    • The title of this book is “Cataloguing is fun!”
  – Subject of the statement = Subject: This book
  – Nature of the statement = Predicate: has title
  – Value of the statement = Object: “Cataloguing is fun!”
• This book – has title – “Cataloguing is fun!”
  – subject – predicate – object
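The three-part structure of the slide's example statement can be sketched as a small data structure. This is a minimal illustration only: the `Triple` class and its field names are assumptions made for this sketch, not part of RDF or of any library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One atomic statement: subject, predicate, object (illustrative names)."""
    subject: str
    predicate: str
    obj: str


# The example statement from the slide, split into its three parts.
statement = Triple(
    subject="This book",
    predicate="has title",
    obj="Cataloguing is fun!",
)

# Reading the parts back in order reproduces the original sentence pattern.
print(f"{statement.subject} - {statement.predicate} - {statement.obj}")
```

Here the subject and predicate are still human-readable strings; the next slide replaces them with machine-readable identifiers.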

Machine-readable identifiers
• Uniform Resource Identifier (URI)
  – Can be any unique combination of numbers and letters
    • No intrinsic meaning; it’s just an identifier
• RDF requires the subject and predicate of a triple to be URIs
  – Object can be a URI, or a literal string (“Cataloguing is fun!”)
• URIs can be matched by machine to link triples together
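A rough sketch of how matching URIs links triples together: the object URI of one triple is matched against the subject URI of another. All `http://example.org/...` URIs, the author triple, and the name "A. Cataloguer" are hypothetical placeholders invented for this sketch; only the title literal comes from the slides.

```python
# Hypothetical identifiers; they carry no intrinsic meaning.
BOOK = "http://example.org/resource/book1"
PERSON = "http://example.org/agent/person1"
HAS_TITLE = "http://example.org/prop/hasTitle"
HAS_AUTHOR = "http://example.org/prop/hasAuthor"
HAS_NAME = "http://example.org/prop/hasName"

triples = [
    (BOOK, HAS_TITLE, "Cataloguing is fun!"),  # object is a literal string
    (BOOK, HAS_AUTHOR, PERSON),                # object is a URI
    (PERSON, HAS_NAME, "A. Cataloguer"),       # subject matches the URI above
]


def follow(triples, start, *predicates):
    """Walk from `start` along `predicates`, matching object URIs to subject URIs."""
    current = {start}
    for predicate in predicates:
        current = {o for s, p, o in triples if s in current and p == predicate}
    return current


# Two triples are joined because PERSON appears as object in one and subject in the other.
author_names = follow(triples, BOOK, HAS_AUTHOR, HAS_NAME)
```

The join works purely by string equality of identifiers, which is why each URI has to be unique.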

UNIMARC element identifiers
• Element: Number (ISBN)
  – Tag: 010
  – 1st ind.: b
  – 2nd ind.: b
  – Subfield: a
  – (Unique in element set)
• Coded Information Block: Target audience code
  – 100bba
  – Character position: 17-19
  – (Unique in element set)
• Target audience vocabulary: children, ages 9-14
  – Code: d
  – (Unique in vocabulary)
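The composition of an element identifier from tag, indicators, and subfield might be sketched as follows. The function name and parameters are hypothetical. The sketch assumes, following the slide, that a blank indicator is written as "b"; the optional single character position follows the form 116bba0 used elsewhere in this deck. How a range of positions such as 17-19 is encoded is not shown in the slides, so it is not handled here.

```python
from typing import Optional

BLANK = "b"  # the slides write a blank indicator as "b"


def element_id(tag: str, ind1: str = " ", ind2: str = " ",
               subfield: str = "a", position: Optional[int] = None) -> str:
    """Concatenate tag, indicators, and subfield into one identifier string."""
    def norm(indicator: str) -> str:
        # Treat a space, an empty string, or "#" as a blank indicator.
        return BLANK if indicator.strip() in ("", "#") else indicator

    ident = f"{tag}{norm(ind1)}{norm(ind2)}{subfield}"
    if position is not None:
        ident += str(position)  # single character position only
    return ident


isbn_element = element_id("010", " ", " ", "a")
coded_block = element_id("100")
```

Because each part is fixed-width or single-character, simple concatenation keeps the result unique within the element set.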

Vocabularies and Element sets
• Controlled terminologies represented as vocabularies
• UNIMARC entities, attributes, and relationships form an element set
  – Attributes and relationships represented as properties/predicates
  – Entities represented in RDF as classes
    • But only 1 entity in UNIMARC-B (Resource)
    • ISBD already has an equivalent class for Resource

UNIMARC and ISBD properties
• Element identifier/URI: unimarcb:P205bbb
  – Label (English): (has) issue statement
• Equivalent ISBD URI: isbd:P1011
  – Label (English): has additional edition statement
• The meaning is the same, but the identifiers and labels are different
• unimarcb:P205bbb same as isbd:P1011 (in RDF)
  – Or use isbd:P1011 instead of unimarcb:P205bbb
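One way to picture the effect of declaring the two properties equivalent is a lookup that rewrites one identifier to the other. This is a sketch under assumptions: the slide does not name the RDF mechanism used for "same as", so the equivalence is modelled here as a plain dictionary, and the resource identifier `ex:resource1` and the literal value are hypothetical.

```python
# Declared equivalence between the two property identifiers from the slide.
SAME_AS = {"unimarcb:P205bbb": "isbd:P1011"}


def canonical(predicate: str) -> str:
    """Return the equivalent ISBD property if one is declared, else the input."""
    return SAME_AS.get(predicate, predicate)


def normalise(triple):
    """Rewrite a triple so that equivalent predicates share one identifier."""
    subject, predicate, obj = triple
    return (subject, canonical(predicate), obj)


# Hypothetical data: the same statement expressed with each property.
unimarc_triple = ("ex:resource1", "unimarcb:P205bbb", "example issue statement")
isbd_triple = ("ex:resource1", "isbd:P1011", "example issue statement")

# After normalisation the two statements are identical and can be matched.
merged = {normalise(unimarc_triple), normalise(isbd_triple)}
```

The second option on the slide, using isbd:P1011 directly, corresponds to emitting the normalised form in the first place.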

Translations
• The same identifier is used for translated elements (captions, definitions, etc.) and vocabularies (preferred terms, definitions, etc.)
• E.g. Vocabulary of 116bba0 = Coded data for graphics: Specific material designation

Graphics SMD translation example
• Term identifier/URI: namespace/b
• Notation: b
• Preferred label (English): drawing
• Preferred label (Italian): disegno
• Preferred label (Portuguese): desenho
• Definition (English): An original visual representation (other than a print or painting)...
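The idea that one identifier carries labels in several languages can be sketched as a record keyed by language. The dictionary layout, the language codes, and the `label` helper are assumptions for illustration; the URI is left as the slide's own placeholder `namespace/b`.

```python
# One term, one identifier, several language-specific labels (values from the slide).
term = {
    "uri": "namespace/b",  # placeholder exactly as given on the slide
    "notation": "b",
    "pref_label": {"en": "drawing", "it": "disegno", "pt": "desenho"},
}


def label(term: dict, lang: str, fallback: str = "en") -> str:
    """Return the preferred label in `lang`, falling back to English."""
    labels = term["pref_label"]
    return labels.get(lang, labels[fallback])


italian = label(term, "it")
missing = label(term, "fr")  # no French label on the slide, so English is returned
```

Data that points at the term identifier never changes when a translation is added; only the label table grows.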

Triples from UNIMARC records
• Create or obtain URI for the Resource described
• Obtain URI for UNIMARC tag/subfield
  – Direct from tag/indicators/subfield encoding
• Obtain URI of value of subfield, or use a literal value
  – URI from vocabulary or UNIMARC Authority
• Publish triple
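The steps above can be sketched roughly as follows. Everything named here is an assumption for illustration: the `http://example.org/...` namespaces, the "P" prefix on property names (modelled on the identifier unimarcb:P205bbb in this deck), the `VOCABULARIES` table, and the sample values. Literals and URIs are both plain strings in this sketch; a real RDF serialisation would distinguish them.

```python
UNIMARCB = "http://example.org/unimarcb/"  # hypothetical element-set namespace

# Hypothetical table: elements whose values are codes from a vocabulary.
VOCABULARIES = {
    "116bba0": "http://example.org/unimarc/terms/116bba0/",
}


def property_uri(element_id: str) -> str:
    """Step 2: derive the property URI directly from the element encoding."""
    return f"{UNIMARCB}P{element_id}"


def to_triple(resource_uri: str, element_id: str, value: str):
    """Steps 2-3: build one triple, using a vocabulary URI when one is known."""
    vocabulary = VOCABULARIES.get(element_id)
    obj = vocabulary + value if vocabulary else value  # URI or literal
    return (resource_uri, property_uri(element_id), obj)


# Step 1: a URI for the resource described (hypothetical).
resource = "http://example.org/resource/1"

# Step 4: the triples ready to publish.
triples = [
    to_triple(resource, "010bba", "EXAMPLE-ISBN"),  # literal value
    to_triple(resource, "116bba0", "b"),            # coded value -> term URI
]
```

The only per-record work is minting the resource URI; the predicate and vocabulary URIs fall out of the record's own encoding.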

Recommendations: Foundation
• Approve the method of identifying UNIMARC elements and vocabularies.
• Approve the pattern for namespaces for UNIMARC/B and /A elements and vocabularies.
• Decide on initial creation and maintenance of UNIMARC elements and vocabularies in the Open Metadata Registry (OMR).
• Decide between re-using existing ISBD namespaces for UNIMARC/B and representing all UNIMARC/B elements, linking them to existing ISBD classes and properties as appropriate.

25/05/2021 Dunsire & Willer. UNIMARC and Linked Data, IFLA 2011 San Juan, Puerto Rico

Recommendations: Foundation
• Investigate further the choice between re-using existing FRAD/FRBR and FRSAD namespaces and representing all UNIMARC/A elements, linking them to existing FRAD/FRBR/FRSAD classes/subclasses and properties as appropriate.
• Investigate further the appropriate classes for UNIMARC/A in relation to UNIMARC/B, FRAD/FRBR and FRSAD.
• Support and promote the translation of UNIMARC classes and properties in national languages.

Recommendations: Application
• Discuss and consider the requirements for Application Profiles for UNIMARC.
• Check and verify the availability of SKOS representations of other external vocabularies used in UNIMARC.
• Investigate and verify internal UNIMARC vocabularies for suitable SKOS representations; consider approaching the owners of external vocabularies to liaise on developing SKOS representations.

Recommendations: Application
• Investigate further the “combinatorial explosion” of UNIMARC properties; determine if some combinations are invalid and do not require a separate property.
• Consider and approve the re-use of aggregated ISBD elements which are represented in RDF using Syntax encoding schemes (SES), which will avoid the need for developing UNIMARC equivalents.
• Monitor relevant MARC 21 developments, especially the Bibliographic Framework Transition Initiative recently announced by the Library of Congress.

Thank you
• gordon@gordondunsire.com
• mwiller@unizd.hr