www.medra.org
mEDRA project: multilingual European DOI (Digital Object Identifier) Registration Agency
Gabriella Scipione, Cineca
Kerkira, 23 May 2003
CINECA is the largest scientific computing centre in Italy and one of the largest in the European Union. It is a point of contact for academia, research and industry.
Cineca: Statutory aims
• To promote the use of the most advanced information processing systems in order to support public and private scientific and technological research
• To provide all the Consortium members, the MIUR (Ministry of Education, University and Research) and other Ministries with specialized computer processing services
• To extend the use of the available resources to other public and private bodies
Cineca: main activities
• High Performance Computing: the Division promotes and supports the use of these resources. VIS.I.T. (VISual Information Technology) is an interdisciplinary laboratory equipped with powerful graphics supercomputers for graphical processing. This division has enabled Cineca to take part in numerous e-projects.
• Services for the Ministry (MIUR): an e-vote system for electing the members of the judging commissions for university appointments (secret and anonymous online election), and a service for online university enrolment.
• Value-added Internet services for several public institutions: Cineca creates and maintains several portals; provides the Ministry of Foreign Affairs with the software infrastructure that allows embassies worldwide to exchange documents securely; and offers, for the Ministry of Justice, free and unified access to national and European legal material hosted on public institutions' Internet sites.
• Administrative services for the universities
CINECA HPC facility
Summary
Part I: what the DOI is
1. DOI is a standard identifier
2. DOI: interoperability
3. DOI is an alphanumeric string
4. DOI can integrate other standard identifiers
5. DOI is a tool to describe the identified entity
6. DOI is "actionable" in the Internet
7. DOI is a persistent identifier
8. DOI system is based on a designed social structure
Part II: what the mEDRA project is
Part III: concrete applications of DOI
• Voluntary deposit system
• Persistent citation system
• Tracking relations between IP entities
What the DOI is
1. DOI (Digital Object Identifier) is a new standard for identifying digital objects, adopted in particular by the publishing community.
It can be applied to any type or format of object: text, music, film, video, photographs, software, database records, certificates…
Two aspects:
1. It uniquely identifies the object, and therefore enables computers to interoperate about it and execute transactions of all kinds.
2. It provides linking to the object itself (or to any related objects, transactions or services). These links are permanent.
DOI: Interoperability
The DOI plays the same role in e-content trade that the bar code plays in the trade of tangible goods: it facilitates interoperability between the information systems of the parties involved.
The DOI syntax
2. DOI is an alphanumeric string, divided into two parts: a prefix and a suffix.
Prefix: 10.abc123    Suffix: def456    Full DOI: 10.abc123/def456
• "10" identifies the string as a DOI
• "abc123" identifies the registrant: a publisher or imprint, or a content producer in general
• "def456" identifies the digital object; assigning it is the registrant's responsibility
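The prefix/suffix structure can be illustrated with a minimal Python sketch. It splits the slide's example string (written without spaces) into its three components. The function and field names are illustrative assumptions, and the validation is deliberately simple rather than a full implementation of the DOI syntax rules.

```python
from typing import NamedTuple


class DoiParts(NamedTuple):
    directory: str   # the leading "10", marking the string as a DOI
    registrant: str  # the code identifying the registrant
    suffix: str      # chosen by the registrant to identify the object


def split_doi(doi: str) -> DoiParts:
    """Split a DOI string into the three components named on the slide."""
    # Only the first "/" separates prefix from suffix; a suffix may contain more.
    prefix, sep, suffix = doi.strip().partition("/")
    if not sep or not suffix:
        raise ValueError(f"missing '/' between prefix and suffix: {doi!r}")
    directory, dot, registrant = prefix.partition(".")
    if directory != "10" or not dot or not registrant:
        raise ValueError(f"prefix must look like '10.<registrant>': {doi!r}")
    return DoiParts(directory, registrant, suffix)


parts = split_doi("10.abc123/def456")
print(parts.directory, parts.registrant, parts.suffix)  # 10 abc123 def456
```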
DOI and other standards
3. DOI can integrate other standard identifiers, creating a common framework
• ISBN also applies to electronic publications, but its scope is limited to monographs (the definition is under discussion in the ISO working group revising ISBN)
• ISSN (like other systems for identifying articles) applies only to continuing resources
The DOI Metadata
4. DOI is a tool to describe the identified entities
DOI = string + set of metadata
• Each DOI has a minimum set of metadata (kernel metadata)…
• … and additional metadata appropriate for the application profile. The latter include "descriptive metadata" (related to the genre of the resource) and "service and administrative metadata" (related to applications).
• XML Schemas are used to collect metadata
DOIs are actionable in the Internet
5. DOI is supported by a resolution system that makes DOIs "actionable" in the Internet.
Users can resolve a DOI to the identified content and/or to other resources related to that content. Unlike URLs, DOIs are associated with documents, not with locations: if a document is moved to a different location, users are redirected to the correct page.
DOI resolution system
The underlying technology is the Handle System®, developed by CNRI (Corporation for National Research Initiatives).
• Handle is a resolution system, i.e. a tool that resolves a name to a source of information (typically a URL). It belongs to the N2L (URN to URL) technologies.
• Handle allows multiple resolution, i.e. a DOI can point to more than one source of information.
DOI resolution system
• URLs are included in the metadata
• The registrant is technically able, and is required, to update the metadata in order to guarantee persistence
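The two properties described above, multiple resolution and persistence through metadata updates, can be sketched with a toy in-memory resolver. This is not the Handle System API: the class, its methods, the resolution "kinds" and the example URLs are all hypothetical, chosen only to show that a citation keeps working when the registrant updates the record.

```python
class ToyResolver:
    """In-memory stand-in for a resolution service (not the Handle System API)."""

    def __init__(self):
        # Maps each DOI to a dict of {kind of resource: URL}.
        self._records = {}

    def register(self, doi, **urls):
        """Store one or more typed URLs for a DOI (multiple resolution)."""
        self._records[doi] = dict(urls)

    def update(self, doi, kind, url):
        """The registrant moves a resource: the record changes, the DOI does not."""
        self._records[doi][kind] = url

    def resolve(self, doi, kind="content"):
        """Return the URL currently registered for this DOI and kind."""
        return self._records[doi][kind]


resolver = ToyResolver()
resolver.register(
    "10.abc123/def456",
    content="http://publisher.example/old/article.html",
    abstract="http://publisher.example/old/abstract.html",
)

# The document moves; anyone citing the DOI is unaffected.
resolver.update("10.abc123/def456", "content", "http://publisher.example/new/article.html")
print(resolver.resolve("10.abc123/def456"))
print(resolver.resolve("10.abc123/def456", kind="abstract"))
```

Persistence here depends entirely on `update` being called, which mirrors the slide's point that the registrant must keep the metadata current.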
Resolution vs Identification
Remember that "what the DOI identifies" and "what the DOI resolves to" are two different concepts.
[Diagram: one DOI with four resolutions (Resolution 1 to 4), leading to the identified entity, information on the document, how to buy the document, and the abstract. The identified entity is drawn as a page of text, which reads:]
"The DOI® (Digital Object Identifier) is a standard for identifying any object of intellectual property. A DOI provides a means of persistently identifying a piece of intellectual property on a digital network and associating it with related current data. On digital networks, all intellectual property is simply a string of bits; a DOI can apply to any form of intellectual property in any digital environment. DOIs have been called "the bar code for intellectual property": like the physical bar code, they are enabling tools for use all through the supply chain to add value and save cost. A DOI differs from commonly used internet pointers to material such as the URL (Uniform Resource Locator, the usual means of referring to World Wide Web material) because it identifies an object as a first-class entity, not simply the place where the object is located. A DOI is also different from commonly used identifiers of intellectual property like standard bibliographic and related identifiers (ISBN, ISSN, ISRC, etc.) because it is associated with defined services and is immediately "actionable" on a network. However, the DOI does not compete with these standards since it allows them to be integrated as suffixes in DOI strings. A DOI is an implementation of the Internet concepts of Uniform Resource Name and Universal Resource Identifier. A DOI is different from abstract naming specifications such as URN in that it is a defined identification…"
Resolution vs Identification
It is also possible that the DOI does not resolve to the identified entity.
[Diagram: one DOI identifying an entity (e.g. a book) with three resolutions (Resolution 1 to 3), leading to information on the document, how to buy the document, and the abstract; none leads to the identified entity itself.]
DOI: Persistence
6. DOI is a persistent identifier
Why a persistent identifier? URLs are not sufficiently reliable for identification:
"the half-life of a referenced URL is approximately four years from its publication date" (Diomidis Spinellis, The Decay and Failures of Web References, ACM).
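To make the quoted figure concrete, here is a small worked calculation. It assumes that link loss follows simple exponential decay, which is what the term "half-life" suggests but is a modelling assumption rather than something stated on the slide; the function name is illustrative.

```python
def surviving_fraction(years: float, half_life: float = 4.0) -> float:
    """Fraction of cited URLs still reachable after `years`,
    assuming exponential decay with the given half-life (in years)."""
    return 0.5 ** (years / half_life)


for years in (4, 8, 12):
    print(f"after {years} years: {surviving_fraction(years):.1%} of URLs still work")
```

Under this assumption, half of the URLs cited in a publication would be dead after four years and three quarters after eight.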
[Diagram sequence: (1) several URLs point to a piece of content; (2) the content moves and some URLs return "404 File not found"; (3) citing pages use DOIs, which pass through the DOI directory, maintained by the publisher, to reach the current URL of the content.]
DOI and Preservation
The connection between preservation and DOI work lies in interoperability and persistence:
1) A distributed virtual archive requires that all the players and components interoperate.
2) Preservation: how do we interoperate with the future? Persistence is interoperability with the future.
DOI Governance and structure
7. DOI system is based on a designed social structure
DOIs, like ISBNs, are registered through Registration Agencies.
[Diagram labels: IDF; CNRI (technology: Handle System); Registration Agencies RA 1, RA 2 and mEDRA; Community of interest 1; Community of interest 2; European content industry; set-up of a Local Handle System.]
What the mEDRA project is
• mEDRA is a project within the eContent programme of the European Commission (Action line 2)
• Start: 1 July 2002. Conclusion: 30 June 2004
• Objective: to set up a multiple-application, multi-lingual European DOI Registration Agency
  • Multiple-application: the existing agencies mainly focus on individual applications
  • Multi-lingual: the existing agencies (except Enpia) focus on English-based content
  • European: the existing agencies are based in the USA (three), Australia, Korea and the UK
What the mEDRA project is: results achieved
mEDRA has been appointed as a DOI Registration Agency; the appointment will be effective by July 2003.
mEDRA partnership
• AIE: the Italian publishers association (co-ordinator)
• MVB: a company of the German publishers association and ISBN agency for the German-language area
• SNE: the French publishers association
• Editrain: a Spanish company specialised in services for the book trade
• CINECA: the technology provider, a consortium of Italian universities
The mEDRA blueprint
Done:
• Preliminary studies (market survey, metadata study and technological assessment), completed on 30 Nov. 2002
• Architectural design, completed on 28 Feb. 2003
Forthcoming:
• System implementation: a first commercial release of the whole system will be delivered by 30 Nov. 2003; publishers can join the system from July 2003
• Experimentation involving early adopters: Dec. 2003 to Apr. 2004
• Second release of the system and start of the commercial phase by 30 June 2004
Application 1: Voluntary deposit system
Printed works, thanks to legal deposit, always have a certified publication date. What about digital editions?
mEDRA is studying the possibility of developing a voluntary deposit system that uses DOI technology to identify content, together with time stamping and digital signatures.
Aims:
• to certify (using time-stamping technologies) that a piece of digital or web content was registered by a party on a certain date
• to provide a tool for use in disputes over rights ownership or authorship
Certifications can be enforced against third parties.
Application 1: Voluntary deposit system
The service will initially be available only for text files (both monographs and journal articles), and at a later stage also for images (photographs, icons etc.) and software.
The mEDRA system has to guarantee that:
1) the content was registered by a party on a certain date
2) the metadata identify the author
Application 1: Voluntary deposit system
• In a deposit request, the user will send mEDRA the document, identified by a DOI, together with its metadata. To certify that the deposit took place on a certain date, mEDRA will apply a timestamp and sign the document with mEDRA's signature.
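The deposit flow can be sketched as follows. The slides do not specify algorithms or formats, so everything here is an assumption: the receipt layout, the function names, and in particular the use of an HMAC with a shared secret as a stand-in for a real digital signature. An actual service would use a public-key signature and a trusted time-stamping source.

```python
import hashlib
import hmac
from datetime import datetime, timezone

# Hypothetical secret standing in for the agency's private signing key.
AGENCY_KEY = b"demo-key-not-for-real-use"


def make_receipt(doi: str, document: bytes, when: datetime) -> dict:
    """Bind a DOI, a document digest and a timestamp, then 'sign' them."""
    digest = hashlib.sha256(document).hexdigest()
    stamp = when.isoformat()
    payload = f"{doi}|{digest}|{stamp}".encode()
    signature = hmac.new(AGENCY_KEY, payload, hashlib.sha256).hexdigest()
    return {"doi": doi, "sha256": digest, "timestamp": stamp, "signature": signature}


def verify_receipt(receipt: dict, document: bytes) -> bool:
    """Check that the document matches the receipt and the signature is intact."""
    digest = hashlib.sha256(document).hexdigest()
    payload = f"{receipt['doi']}|{digest}|{receipt['timestamp']}".encode()
    expected = hmac.new(AGENCY_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["signature"])


receipt = make_receipt("10.abc123/def456", b"full text of the work",
                       datetime.now(timezone.utc))
print(verify_receipt(receipt, b"full text of the work"))  # True
print(verify_receipt(receipt, b"tampered text"))          # False
```

Signing a digest rather than the file itself means the receipt can be checked later by anyone who holds the original document, without the agency having to publish the content.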
Application 2: Persistent citation system
• The need: to replace URLs with DOIs when citing Internet resources
• This makes citations of resources persistent (no more "error 404: file not found"!)
• Making persistence effective is a task for the registration agencies
• It is a very simple application that exploits one of the key features of DOI while providing further added value:
  • check routines to monitor persistence
  • an alert system to avoid broken links
  • procedures and discipline for publishers
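A check routine of the kind listed above might look like the following sketch. The function name and record layout are assumptions, and the HTTP lookup is passed in as a callable so that the example runs offline with canned responses; a real routine would issue actual requests and feed the result to the alert system.

```python
from typing import Callable, Dict, List


def find_broken(records: Dict[str, str], status_of: Callable[[str], int]) -> List[str]:
    """Return the DOIs whose registered URL answers with an HTTP error status."""
    return [doi for doi, url in records.items() if status_of(url) >= 400]


# Canned responses instead of real HTTP requests, so the example runs offline.
fake_status = {
    "http://publisher.example/a.html": 200,
    "http://publisher.example/b.html": 404,
}
records = {
    "10.abc123/a": "http://publisher.example/a.html",
    "10.abc123/b": "http://publisher.example/b.html",
}

broken = find_broken(records, lambda url: fake_status[url])
print(broken)  # ['10.abc123/b']
```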
Application 3: Tracking relations between IP entities
Aim: to keep the relations between digital objects under control.
• We have defined five kinds of "parent-child" relations between IP entities:
  • A is part of B (a chapter of a book…)
  • A is a different product form of B (PDF and DOC files of the same content)
  • A is a new edition of B (different in content)
  • A is a different language version of B
  • A is a resource about B (a press release or a cover of a book)
• They are registered in the metadata so that users (end users and intermediaries) can navigate all the relations with a single tool
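The five relation types lend themselves to a simple data model. The sketch below is one possible representation, not the mEDRA metadata schema: the enum names, the triple layout and the example DOIs are all made up for illustration.

```python
from enum import Enum


class Relation(Enum):
    IS_PART_OF = "is part of"
    IS_DIFFERENT_FORM_OF = "is a different product form of"
    IS_NEW_EDITION_OF = "is a new edition of"
    IS_LANGUAGE_VERSION_OF = "is a different language version of"
    IS_RESOURCE_ABOUT = "is a resource about"


# (A, relation, B) triples; the DOIs are hypothetical.
LINKS = [
    ("10.abc123/book.ch1", Relation.IS_PART_OF, "10.abc123/book"),
    ("10.abc123/book.pdf", Relation.IS_DIFFERENT_FORM_OF, "10.abc123/book"),
    ("10.abc123/book.cover", Relation.IS_RESOURCE_ABOUT, "10.abc123/book"),
]


def related_to(target, links=LINKS):
    """List every (A, relation) pair recorded against the entity `target`."""
    return [(a, rel) for a, rel, b in links if b == target]


for doi, rel in related_to("10.abc123/book"):
    print(f"{doi} {rel.value} 10.abc123/book")
```

Storing the relations as typed triples keeps navigation uniform: the same lookup answers "which chapters belong to this book" and "which formats exist for it".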
Contacts
Gabriella Scipione
g.scipione@cineca.it
Tel. +39 051 6171634
Application 1: Voluntary deposit system
• For legal purposes, mEDRA will request both the timestamps and an mEDRA certificate from a valid Certification Authority
Cineca: main activities
• VIS.I.T. (VISual Information Technology): an interdisciplinary laboratory equipped with powerful graphics supercomputers for graphical processing, which has enabled Cineca to take part in numerous e-projects