PREMIS Data Dictionary and the Future of Preservation Metadata

PREMIS Data Dictionary and the Future of Preservation Metadata
Brian Lavoie, Research Scientist, OCLC Research
lavoie@oclc.org
Society of American Archivists
Washington, D.C., August 5, 2006

Preservation Metadata
“Information that supports and documents the digital preservation process”
  • Provenance: Who has had custody/ownership of the digital object?
  • Rights Mgmt.: What IPR must be observed?
  • Authenticity: Is the digital object what it purports to be?
  • Technical Environment: What is needed to render and use the digital object?
  • Preservation Activity: What has been done to preserve the digital object?

Why is preservation metadata important?
§ Digital objects are technology-dependent …
  • Complex technical environment between content and user
  • Means to access and use archived object must be documented
  • Technical metadata especially important
§ Digital objects are mutable …
  • Can be easily altered, impacting look, feel, functionality
  • Changes to object must be documented/validated
  • Provenance/authenticity metadata especially important
§ Digital objects are bound by intellectual property rights …
  • Preservation often proceeds while copyright still in effect
  • May constrain preservation activities and access policies
  • Rights management metadata especially important
§ Preservation metadata makes digital objects self-documenting across time
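The point that changes to a digital object must be documented and validated can be illustrated with a fixity check. What follows is a minimal sketch and not something taken from the PREMIS report: it assumes SHA-256 as the digest algorithm, and the names `sha256_of` and `check_fixity` are hypothetical helpers introduced only for this example.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(path: Path, recorded_digest: str) -> bool:
    """True if the file still matches the digest recorded earlier."""
    return sha256_of(path) == recorded_digest


# Demo on a throwaway file (hypothetical content, for illustration only).
with tempfile.TemporaryDirectory() as workdir:
    archived = Path(workdir) / "object.bin"
    archived.write_bytes(b"original bitstream")

    # Digest captured at ingest and kept as preservation metadata.
    recorded = sha256_of(archived)
    unchanged = check_fixity(archived, recorded)

    # Simulate an undocumented alteration of the object.
    archived.write_bytes(b"altered bitstream")
    altered_detected = not check_fixity(archived, recorded)
```

A digest recorded at ingest only detects that something changed. Recording who changed the object, when, and why is the job of the provenance metadata described above.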

PREMIS Working Group
§ “Early days” … various preservation metadata element sets released
  • Different scopes, purposes, underlying models/assumptions
  • No international standard; little consolidation of expertise/best practice
§ June 2003: OCLC, RLG sponsored international working group:
  • PREMIS: Preservation Metadata: Implementation Strategies
§ Objective:
  • Define implementable, core preservation metadata, with guidelines/recommendations for management and use
§ Membership:
  • > 30 experts from 5 countries, libraries, museums, archives, government agencies, private sector

PREMIS Data Dictionary
§ May 2005: Data Dictionary for Preservation Metadata: Final Report of the PREMIS Working Group
§ 237-page report includes:
  • PREMIS Data Dictionary 1.0
  • Context/assumptions, data model, usage examples
§ Set of XML schema to support implementation
§ Data Dictionary:
  • Comprehensive view of information needed to support digital preservation
  • Based on deep pool of institutional experiences in setting up and managing operational capacity for digital preservation
http://www.oclc.org/research/projects/pmwg/premis-final.pdf

2005 British Conservation Awards: Digital Preservation Award
2006 Society of American Archivists Preservation Publication Award

Some guiding principles …
§ “Implementable, core, preservation metadata”:
  • “Preservation metadata”: maintain viability, renderability, understandability, authenticity, identity in a preservation context
  • “Core”: What most preservation repositories need to know to preserve digital materials over the long term
  • “Implementable”: rigorously defined; supported by usage guidelines/recommendations; emphasis on automated workflows
§ “Technical neutrality”:
  • Digital archiving system: no assumptions about specific archiving technology, system/DB architectures, preservation strategy
  • Metadata management: no assumptions about whether metadata is stored locally or in external registry; recorded explicitly or known implicitly; instantiated in one metadata element or multiple elements
  • Promotes flexibility, applicability in wide range of contexts

Sample Data Dictionary entry

PREMIS Maintenance Activity
§ Web site:
  • Permanent Web presence, hosted by Library of Congress
  • Central destination for PREMIS-related info, announcements, resources
  • Home of the PREMIS Implementers’ Group (PIG) discussion list
§ PREMIS Editorial Committee:
  • Set directions/priorities for PREMIS development
  • Coordinate future revisions of Data Dictionary and XML schema
  • Membership: Library of Congress, OCLC, FCLA, National Archives of Scotland, British Library, National Library of Australia, U. of Goettingen, LANL, (two more seats still TBD)
  • Will convene late August/early September
http://www.loc.gov/standards/premis/

Current activities
§ Documenting errata and proposed revisions to Data Dictionary (feedback through PIG list)
  • http://www.loc.gov/standards/premis/changes.html
§ PREMIS Implementers’ Registry
  • http://www.loc.gov/standards/premis-registry.html
§ Consultancies:
  • Rights issues for digital preservation (Karen Coyle)
  • PREMIS implementation guidelines and recommendations (Deborah Woodyard-Robinson)
§ PREMIS Workshops:
  • 2-day tutorial on Data Dictionary and implementation issues
  • Digital Curation Center PREMIS workshop (July 17-18, Glasgow)
  • DLF Forum (Boston, early November)

Looking to the Future …
§ Basic questions (“what type”, “how much”) still unsettled …
  • Digital preservation processes still not fully tested/understood
  • Hard to judge effectiveness a priori
  • Important to document and share practical experience
§ Workflows for preservation metadata …
  • Tools to support automatic generation of preservation metadata (JHOVE, NLNZ tools)
  • Tools should support formal metadata schemas (like PREMIS)
  • Registries (PRONOM, GDFR)
§ Harmonization with other initiatives …
  • Integrate PREMIS with other standards, technologies, best practices
  • E.g., Z39.87, METS
  • Not just standards, but integrated solutions
§ Division of labor …
  • Efficient strategies for collecting preservation metadata: i.e., WHO and WHEN (Automatic Exposure project)
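The call for tools that generate preservation metadata automatically can be made concrete with a small sketch. This is an illustration under stated assumptions, not how JHOVE or the NLNZ tools work: the function name `describe_file` and the dictionary keys are hypothetical and are not PREMIS semantic units, and the format guess relies only on the file extension, which is far weaker than the format identification and validation a dedicated tool performs.

```python
import hashlib
import mimetypes
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def describe_file(path: Path) -> dict:
    """Collect basic technical metadata for a file without manual input.

    The keys below are illustrative names, not PREMIS semantic units.
    """
    data = path.read_bytes()
    # Extension-based guess only; returns (None, None) for unknown types.
    guessed_type, _encoding = mimetypes.guess_type(path.name)
    return {
        "name": path.name,
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "format_guess": guessed_type or "unknown",
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


# Demo on a throwaway file (hypothetical content, for illustration only).
with tempfile.TemporaryDirectory() as workdir:
    sample = Path(workdir) / "report.txt"
    sample.write_bytes(b"hello")
    record = describe_file(sample)
```

Capturing such values at the moment a file is created or ingested, rather than reconstructing them later, is one way to approach the "who and when" question raised under division of labor.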