Eprints Application Profile
Andy Powell, Eduserv Foundation
andy.powell@eduserv.org.uk
www.eduserv.org.uk/foundation
June 2006

Agenda
• Welcome and introductions
• Issues with current use of simple DC
• Functional Requirements
• Model
• Lunch
• Eprints Application Profile
• Workplan
Eprint Application Profile Meeting - London June 2006

Current issues
• what’s the problem with using simple DC to describe eprints?
• difficult to differentiate ‘works/expressions’ from ‘manifestations/items’
• does dc:identifier identify the work/expression or a particular manifestation/item of the work?
  – in ePrints UK guidelines, dc:identifier used to identify ‘work/expression’ and dc:relation used to identify ‘manifestation/item’
  – but dc:relation may be used for other resources (e.g. cited works), therefore ambiguity in the metadata record
  – and guidelines not widely implemented anyway…
  – therefore difficult for software applications to move reliably from the metadata record to the full text
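The dc:relation ambiguity can be illustrated with a minimal Python sketch. The record below is hypothetical (all titles and URLs are invented for illustration) and assumes the ePrints UK convention described above: dc:identifier for the work/expression, dc:relation for the manifestation/item.

```python
# Hypothetical simple DC record; every value is invented for illustration.
record = {
    "dc:title": ["An example eprint"],
    "dc:identifier": ["http://repository.example.org/eprint/123"],
    "dc:relation": [
        "http://repository.example.org/eprint/123/paper.pdf",  # the full text
        "http://publisher.example.com/article/456",            # a cited work
    ],
}


def candidate_full_text_urls(rec):
    """Return every dc:relation value.

    Simple DC does not say what kind of relation each value expresses,
    so software cannot separate the full text from a cited work.
    """
    return list(rec.get("dc:relation", []))


candidates = candidate_full_text_urls(record)
print(len(candidates))  # number of equally plausible full-text links
```

The comments in the record are visible to a human reader only; nothing in the metadata itself distinguishes the two dc:relation values.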

Current issues (2)
• not possible to determine whether subject terms are taken from a controlled vocabulary or not (e.g. is ‘Physics’ a free-text keyword or a term taken from Dewey?)
  – therefore difficult to base subject-browse interfaces on controlled vocabulary hierarchy
• not possible to disambiguate authors with the same name or reconcile instances of the same author being given different forms of name
  – therefore difficult to build browse-by-author type interfaces
• dates are ambiguous (either because of formatting and/or because type of date is not known)

Functional requirements
• support search based on title, author, description, keyword, full text index
• support browse by keyword and author
• support rich subject browse based on knowledge of controlled vocabulary
• support filtering of search results and browse tree by type, publisher, date range, status and version(?)
• display title, author, publisher, keyword, full-text match in search results and browse tree
• move reliably from search results and browse tree to available copies, filtered by format
• move from search results and browse tree to OpenURL ‘link server’

Functional requirements (2)
• enable capture of metadata about and relationships between different ‘versions’ of the same eprint
• be suitable for use in the context of OpenURLs and OpenURL resolvers
• i.e. support navigation/discovery of a particular version of an eprint (e.g. most recent version of Author’s Original) and navigation/discovery of the most appropriate copy of the discovered ‘version’
• be compatible with dc-citation WG recommendations
• be compatible with preservation metadata approaches
• be compatible with library cataloguing approaches

Functional assumptions
• citations are made between eprint ‘expressions’ (in FRBR terms)
• hypertext links tend to be made between eprint ‘items’ (in FRBR terms)
• adopting a simple underlying model now may be expedient in the short term but costly to interoperability in the long term
• the underlying model needs to be as complex as it needs to be, but not more so!
• a complex underlying model may be manifest in relatively simple metadata and/or end-user interfaces

FRBR (1)
• FRBR models the bibliographic world using 4 key entities: 'Work', 'Expression', 'Manifestation' and 'Item'.
  – A work is a distinct intellectual or artistic creation. A work is an abstract entity.
  – An expression is the intellectual or artistic realization of a work in the form of alpha-numeric, musical, or choreographic notation, sound, image, object, movement, etc., or any combination of such forms. An expression is the specific intellectual or artistic form that a work takes each time it is "realized."
  – A manifestation is the physical embodiment of an expression of a work. The entity defined as manifestation encompasses a wide range of materials, including manuscripts, books, periodicals, maps, posters, sound recordings, films, video recordings, CD-ROMs, multimedia kits, etc.
  – An item is a single exemplar of a manifestation. The entity defined as item is a concrete entity.

FRBR (2)
• FRBR also defines a set of additional entities that are related to the four entities above ('Person', 'Corporate body', 'Concept', 'Object', 'Event' and 'Place') and a set of relationships between each of the entities.
• the key entity-relations appear to be:
  – Work -- is realized through --> Expression
  – Expression -- is embodied in --> Manifestation
  – Manifestation -- is exemplified by --> Item
  – Work -- is created by --> Person or Corporate Body
  – Manifestation -- is produced by --> Person or Corporate Body
  – Expression -- has a translation --> Expression
  – Expression -- has a revision --> Expression
  – Manifestation -- has an alternative --> Manifestation

FRBR (3)
• Simple metadata standards like Dublin Core have traditionally tended to model the resources being described in a rather flat way, for example as a set of relatively unrelated 'document-like objects'
• while this approach may be sufficient in the context of describing Web pages, it is rather limited in those cases, like scholarly publications, where the things being described are more complex. For example, a typical eprint (the publisher's PDF file that is deposited in an eprint archive) is a single item that is an exemplar of a particular manifestation (the PDF manifestation) of a particular expression (the published version) of a work (the conceptual work that is the eprint). There may be other items that are exemplars of the same manifestation (the PDF file as served from the publisher's Web site, for example), other manifestations of the same expression (the HTML manifestation), and other expressions of the same work (the pre-print, for example), and so on.

Model
• based on FRBR
• but some of the labels have been changed
• intention is to make things more intuitive
• but may not have succeeded!

Eprints model
[Diagram: Eprint -- isExpressedAs (0..∞) --> Version -- isManifestedAs (0..∞) --> Format -- isAvailableAs (0..∞) --> Copy; Agent (0..∞) is linked to the model by isAuthoredBy and isPublishedBy]

Eprints model and FRBR
[Diagram: the Eprints model annotated with the corresponding FRBR entities]
• Eprint = FRBR Work
• Version = FRBR Expression
• Format = FRBR Manifestation
• Copy = FRBR Item

Eprints model and FRBR
[Diagram: the Eprints model annotated with examples]
• Eprint: the eprint (an abstract concept)
• Agent: the author or the publisher
• Version: the ‘version of record’ or the ‘french version’ or ‘version 2.1’
• Format: the PDF format of the version of record
• Copy: the publisher’s copy of the PDF …
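The Eprint, Version, Format, Copy chain can be sketched as plain Python dataclasses. This is an illustrative sketch only, not part of the profile: the class fields, the `copies_in_format` helper, and all example values are assumptions, and the isPublishedBy relationship is left out for brevity. It shows how one of the functional requirements (moving to available copies, filtered by format) might follow from the layered model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    name: str


@dataclass
class Copy:  # FRBR item
    uri: str


@dataclass
class Format:  # FRBR manifestation
    media_type: str
    copies: List[Copy] = field(default_factory=list)  # isAvailableAs


@dataclass
class Version:  # FRBR expression
    status: str
    formats: List[Format] = field(default_factory=list)  # isManifestedAs


@dataclass
class Eprint:  # FRBR work
    title: str
    authors: List[Agent] = field(default_factory=list)  # isAuthoredBy
    versions: List[Version] = field(default_factory=list)  # isExpressedAs


def copies_in_format(eprint: Eprint, media_type: str) -> List[str]:
    """Walk Eprint -> Version -> Format -> Copy, keeping copies of one format."""
    return [
        copy.uri
        for version in eprint.versions
        for fmt in version.formats
        if fmt.media_type == media_type
        for copy in fmt.copies
    ]


# Hypothetical example data; names and URIs are invented.
eprint = Eprint(
    title="An example eprint",
    authors=[Agent("A. N. Author")],
    versions=[
        Version("Author's Original", [
            Format("application/pdf", [Copy("http://repository.example.org/ao.pdf")]),
        ]),
        Version("Version of Record", [
            Format("application/pdf", [
                Copy("http://publisher.example.com/vor.pdf"),
                Copy("http://repository.example.org/vor.pdf"),
            ]),
            Format("text/html", [Copy("http://publisher.example.com/vor.html")]),
        ]),
    ],
)

pdf_copies = copies_in_format(eprint, "application/pdf")
```

Because each Copy hangs off a Format, filtering by format never confuses a copy with a cited work or another version, which is the ambiguity the flat simple DC record could not resolve.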

FRBR for eprints
Here we are using FRBR to model eprints. A work is “a distinct intellectual or artistic creation”. An expression is “the intellectual or artistic realization of a work in the form of alphanumeric … notation …”. A manifestation is “the physical [or digital] embodiment of an expression of a work”. Finally, an item is “a single exemplar of a manifestation”.
Note that “Author’s Original” and “Version of Record” (used below) are taken from the ALPSP/NISO ‘status’ vocabulary at http://www.niso.org/committees/Journal_versioning/TermsandDefinitionsdraft2006.pdf
[Diagram: example hierarchy, listed by level]
• eprint (work): The eprint – an abstract work
• version (expression): Author’s Original 1.0; Author’s Original 1.1; …; Version of Record (English); Version of Record (French)
• format (manifestation): html; pdf
• copy (item): publisher’s copy; institutional repository copy
Note 1: different languages modelled as versions as per FRBR sect 5.3.2
Note 2: orange parts used as basis for examples later…

Vertical vs. horizontal relationships
[Diagram: vertical relationships isExpressedAs (Eprint to Version) and isManifestedAs (Version to Format), shown alongside the horizontal relationships hasVersion and hasFormat]

Vertical vs. horizontal relationships (2)
[Diagram: Eprint isExpressedAs Version; Version isManifestedAs Format (two Version/Format branches shown)]
• hasVersion and hasFormat relationships inferred by following vertical relations
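One way such an inference might work is sketched below in Python. It assumes, as an interpretation rather than something the slides state outright, that hasVersion links two Versions of the same Eprint and hasFormat links two Formats of the same Version. The identifiers and the `infer_siblings` helper are hypothetical.

```python
from itertools import permutations

# Vertical relations as (parent, child) pairs; identifiers are hypothetical.
is_expressed_as = [
    ("eprint1", "authors-original"),
    ("eprint1", "version-of-record"),
]
is_manifested_as = [
    ("version-of-record", "vor-pdf"),
    ("version-of-record", "vor-html"),
]


def infer_siblings(vertical):
    """Derive horizontal pairs: two children are related if they share a parent."""
    by_parent = {}
    for parent, child in vertical:
        by_parent.setdefault(parent, []).append(child)
    pairs = set()
    for children in by_parent.values():
        # Every ordered pair of distinct siblings becomes a horizontal relation.
        pairs.update(permutations(children, 2))
    return pairs


has_version = infer_siblings(is_expressed_as)
has_format = infer_siblings(is_manifested_as)
```

Under this reading the horizontal relations never need to be stored in the metadata, since they can be recomputed from the vertical ones whenever required.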

Attributes
• Eprint: title; subject; abstract; identifier (URI); creator; is expressed as
• Version: date issued; status; version number; language; type; copyright; identifier (URI); OpenURL or citation (string); is manifested as
• Format: format; date modified; identifier (URI); publisher; is available as (URI)
• Copy: identifier (URI)
• Agent: name; date of birth; affiliation; mailbox; homepage; identifier (URI)

Attributes
• Eprint: title; subject; abstract; identifier (URI); creator; is expressed as
• Version: date issued; status; version number; language; type; rights; OpenURL or citation (string); is manifested as
• Format: format; date modified; publisher; is available as (URI)
• Agent: name; type; date of birth; affiliation; mailbox; homepage