Challenges Ahead: Moving from Digital Collections to Digital Archives & Libraries

  • Slides: 68
Challenges Ahead: Moving from Digital Collections to Digital Archives & Libraries
Howard Besser, NYU Moving Image Archiving & Preservation Program
http://besser.tsoa.nyu.edu/howard
Besser--Taiwan NDAP 3/5/05

Challenges Ahead: Moving from Digital Collections to Digital Archives & Libraries
- Models for Digital Archives & Libraries
- Importance of Metadata Standards
- Types and Uses of Metadata
- Administrative and Structural Metadata: METS and The Making of America II Project
- Longevity
- Metadata: The 4/99 NISO/DLF Image Metadata Workshop

Key problems we’re facing
- Discovery
- Longevity
- Interoperability

Serious Longevity Problems
What we know from prior widespread digital file formats:
- Images separating from their metadata
- Inaccessibility of software needed to view an image
- Inability to even decode the file format of an image

Traditional Digital Library Model
[Diagram not reproduced; labels: DL (repeated), search & presentation (repeated), user (repeated)]

Ideal Digital Library Model
[Diagram not reproduced; labels: DL (repeated), search & presentation, user (repeated)]

From Digital Collections to Digital Libraries, Museums, and Archives
- No longer merely experiments
- Adhere to our fields’ traditions (access, interoperability, sustainability, privacy, …)
- Provide services

To be a digital “library”, need at least some of the American Library Association Core Values
- access
- confidentiality/privacy
- democracy
- diversity
- education and lifelong learning
- intellectual freedom
- preservation
- public good
- professionalism
- service
- social responsibility

To respond to our needs for both Service & Traditions, we face the challenges of:
- Access (discovery)
- Sustainability (longevity)
- Interoperability

Importance of Metadata Standards & Philosophies

For Interoperability, Repositories Need Standards (as well as Sustainability & Access)
- Descriptive Metadata for consistent description
- Discovery Metadata for finding
- Administrative Metadata for viewing and maintaining
- Structural Metadata for navigation
- ...
- Terms & Conditions Metadata for controlling access
- ...

Why are Standards and Metadata consensus important?
- Managing digital files over time
- Longevity
- Interoperability
- Veracity
- Recording in a consistent manner
- Will give vendors incentive to create applications that support this

Why Standards?
Why do we need standards?
- To make information universally available to users
- To facilitate sharing and interchange of information
- To preserve information (make it safe from changes in hardware and software)
Standards are the work of communities. They are necessary so that communities can work.

Why are you Managing this Information?
- Organizational mission & type
- Users
- Uses

Semantics/Syntax/Structure
- Semantics: meaning, as defined by a community to meet their particular needs (DC)
- Syntax: a systematic arrangement of data elements for machine processing; facilitates the exchange and use of metadata among various applications (HTML, XML, RDF)
- Structure: a formal arrangement of the syntax with the goal of consistent representation of the semantics (rules defining field contents like 1/11/99)
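The three layers on this slide can be sketched in a few lines. This is only an illustration, not something from the talk: it assumes "DC" refers to Dublin Core, uses XML as the syntax, and picks ISO 8601 dates (YYYY-MM-DD) as a hypothetical structure rule, since a value like 1/11/99 can be read as either 11 January or 1 November. The `record` wrapper element and the function name are made up for the example.

```python
import re
import xml.etree.ElementTree as ET

# Semantics: element meanings come from a community vocabulary (assumed: Dublin Core).
DC_NS = "http://purl.org/dc/elements/1.1/"

# Structure: a rule constraining field contents (assumed rule: ISO 8601 dates).
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def build_record(title: str, date: str) -> str:
    """Serialize a two-field record as XML (the syntax layer)."""
    if not ISO_DATE.match(date):
        raise ValueError(f"date must be YYYY-MM-DD, got {date!r}")
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # hypothetical wrapper element
    ET.SubElement(record, f"{{{DC_NS}}}title").text = title
    ET.SubElement(record, f"{{{DC_NS}}}date").text = date
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    print(build_record("Sample photograph", "1999-01-11"))
```

Passing "1/11/99" raises a `ValueError`, which is the point of a structure rule: ambiguous values are rejected before they reach a shared repository.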

What is Metadata: Types & Uses
- Lots of different ways of dividing the clusters

Uses of Metadata
- Discovery & Retrieval
- Identification/Provenance
- Rights Management
- Viewing
- Integrity
- Longevity
- Content rating

Metadata Encoding and Transmission Standard (METS)
- Derived from MOA II
- Administrative Metadata
- Structural Metadata

METS Goal is Navigability and Interoperability
- Book example

METS Object
[Slides 20-22: METS object diagrams not reproduced]

METS/MOA II Metadata
- Administrative Metadata: for enhancing resource management
- Structural Metadata: for reflecting internal hierarchies and relationships between parts
- Raw/Seared/Cooked

MOA II Best Practices
- Use/Users/Collection: Benchmarking
- Masters vs. Derivatives
- Scanning
- Administrative Metadata
- Structural Metadata

Scanning Best Practices
- Think about users (and potential users), uses, and type of material/collection
- Scan at the highest quality that does not exceed the needs of the likely potential users/uses/material
- Do not let today’s delivery limitations influence your scanning file sizes; understand the difference between digital masters and derivative files used for delivery
- Many documents which appear to be bitonal actually are better represented with greyscale scans
- Include color bar and ruler in the scan
- Use objective measurements to determine scanner settings (do NOT attempt to make the image good on your particular monitor or use image processing to color correct)
- Don’t use lossy compression
- Store in a common (standardized) file format
- Capture as much metadata as is reasonably possible (including metadata about the scanning process itself)
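To make the point about master file sizes concrete, here is a small arithmetic sketch of how large an uncompressed master scan is. The page size, resolution, and bit depths below are illustrative assumptions, not figures from the talk, and the function name is made up.

```python
def uncompressed_bytes(width_in: float, height_in: float, ppi: int, bits_per_pixel: int) -> int:
    """Size of an uncompressed raster: pixels across x pixels down x bits per pixel / 8."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bits_per_pixel // 8


if __name__ == "__main__":
    # Hypothetical 8 x 10 inch original scanned at 600 ppi.
    for label, bits in [("bitonal", 1), ("8-bit greyscale", 8), ("24-bit color", 24)]:
        size = uncompressed_bytes(8, 10, 600, bits)
        print(f"{label}: {size:,} bytes")
```

For these assumed values the results are 3,600,000 bytes (bitonal), 28,800,000 bytes (greyscale), and 86,400,000 bytes (24-bit color), which shows why masters are kept separate from the smaller derivative files used for delivery.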

Why Scale is important

Administrative Metadata: to uniquely identify a digital resource and manage it over time
- Information about where the various pieces/versions of the object reside
- Information to view the digital object
- Information about the scanning process

Administrative Metadata: creation of a digital master image (recorded at point of capture)
- Source type
- Source physical dimension
- Source characteristics
- Source ID
- Scanning date
- Scanner profile (and any adjusting to master image)
- Light source

Administrative Metadata: what is needed to view or use the digital image (recorded from header at point of capture)
- Resolution
- Bit-depth
- Type of Image
- File format
- Compression format
- Dimensions
- Color Lookup Table (CLUT)
- Color space
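Several of these fields can be read directly from a file header. As a hedged illustration, the sketch below pulls dimensions, bit depth, color type, and compression method out of a PNG file's IHDR chunk. PNG is chosen only because its header layout is simple and well documented; the talk does not name a format, and the function and dictionary keys are invented for the example.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Color type codes defined by the PNG specification.
COLOR_TYPES = {
    0: "greyscale",
    2: "truecolor (RGB)",
    3: "indexed (uses a palette / color lookup table)",
    4: "greyscale + alpha",
    6: "truecolor + alpha",
}


def read_png_header(data: bytes) -> dict:
    """Extract viewing-related technical metadata from the start of a PNG file."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # The first chunk must be IHDR: 4-byte length, 4-byte type, 13 bytes of data.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("missing or malformed IHDR chunk")
    width, height, bit_depth, color_type, compression, _filter, _interlace = struct.unpack(
        ">IIBBBBB", data[16:29]
    )
    return {
        "file_format": "PNG",
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "color_type": COLOR_TYPES.get(color_type, f"unknown ({color_type})"),
        "compression": "deflate" if compression == 0 else f"unknown ({compression})",
    }
```

In practice one would call `read_png_header(open(path, "rb").read(29))` at capture time and store the result alongside the image, so the information survives even if the file later becomes hard to open.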

Administrative Metadata: linking the parts of a digital object or its instantiations, providing context (overlaps with Structural)
- Overall View of Image (NR)
- Sequence Number (NR)
- Sequence Total (NR)
- Version (from MESL Data Dictionary - NR)
- Version Date (NR)

Administrative Metadata: ownership, rights, and reproduction info
- Owner (R)
- Display/Transmission Owner Restrictions (NR)
- License Term (NR)
- License Begin Date (NR)
- License End Date (NR)
- Number (NR)
- Copyright Date (NR)
- Credit Line (from MESL Data Dictionary - NR)
- Copy/Distribution Restrictions (NR)

Structural Metadata: that which is relevant to presentation of the digital object to the user
- Metadata defining the “object”: a book, a diary, a photo album
- Metadata defining the “sub-objects”: pages (physical) or chapters and subheads (intellectual)
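The object/sub-object idea maps naturally onto a METS structural map. The following is a minimal sketch, not the MOA II or METS reference encoding: it builds a `structMap` with nested `div` elements for a book, its chapters, and their pages, each page pointing at a file ID via `fptr`. The function name, the TYPE values, and the sample file IDs are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def build_struct_map(title, chapters):
    """Build a logical structMap.

    chapters: list of (chapter_label, [page_file_ids]) tuples (hypothetical input shape).
    """
    ET.register_namespace("mets", METS_NS)

    def q(tag):
        return f"{{{METS_NS}}}{tag}"

    struct_map = ET.Element(q("structMap"), TYPE="logical")
    # The "object": the book as a whole.
    book = ET.SubElement(struct_map, q("div"), TYPE="book", LABEL=title)
    for order, (label, page_ids) in enumerate(chapters, start=1):
        # Intellectual sub-objects: chapters.
        chapter = ET.SubElement(book, q("div"), TYPE="chapter", LABEL=label, ORDER=str(order))
        for page_order, file_id in enumerate(page_ids, start=1):
            # Physical sub-objects: pages, each pointing at a scanned file.
            page = ET.SubElement(chapter, q("div"), TYPE="page", ORDER=str(page_order))
            ET.SubElement(page, q("fptr"), FILEID=file_id)
    return struct_map


if __name__ == "__main__":
    tree = build_struct_map("Diary", [("January", ["p001", "p002"]), ("February", ["p003"])])
    print(ET.tostring(tree, encoding="unicode"))
```

A viewer that understands this map can offer "next page" and "jump to chapter" navigation without knowing anything else about how the files are stored.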

The Short Life of Digital Info: Digital Longevity Problems
- Disappearing Information
- The Viewing Problem
- The Scrambling Problem
- The Inter-relation Problem
- The Custodial Problem
- The Translation Problem

The Viewing Problem
- Digital Info requires a whole infrastructure to view it
- Each piece of that infrastructure is changing at an incredibly rapid rate
- How can we ever hope to deal with all the permutations and combinations?

The Scrambling Problem
Dangers from:
- Compression to ease storage & delivery
- Container Architecture to enhance digital commerce

The Inter-relation Problem
- Info is increasingly inter-related to other info
- How do we make our own Info persist when it points to and integrates with Info owned by others?
- What is the boundary of a set of information (or even of a digital object)?

The Custodial Problem
- In the past, much of survival was due to redundancy
- How do we decide what to save?
- Who should save it?
- Mellon-funded E-Journal Archives
- How should they save it?

The Custodial Problem: How to save information?
Methods for later access:
- Refreshing
- Migration
- Emulation
Issues of authenticity and evidence

The Translation Problem
Content translated into new delivery devices changes meaning
- A photo vs. a painting
- If Info is produced originally in digital form in one encoded format, will it be the same when translated into another format?
- Behaviors

The Translation Problem: Thinking of the Future (1/2)
- Screens will be different resolutions and different aspect ratios
- CRTs won’t exist
- A decade or two from now, today’s user interfaces will look the way arrow-key navigation looks today

The Translation Problem: Thinking of the Future (2/2)
- Today’s streaming media are small windows, slow speeds
- As bandwidth increases, viewers will expect higher quality streams
- Creators may need to consider how they’ll be able to deliver higher-bandwidth streams
  - Delivery Derivatives vs. Masters encoded w/standards
  - May also want to re-edit the piece to take advantage of changes in technology, viewer expectations, society

Pieces of the Solution (1/2)
- We need to insist upon clearly readable standardized ways for digital objects to self-identify their formats
- We should discourage scrambling
- We need to better understand how information inter-relates to other Info, and what constitutes “boundaries” of Info objects
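One existing form of format self-identification is the "magic number": a fixed byte sequence at the start of a file. The sketch below checks a handful of well-known signatures. It is a simplified illustration with a deliberately short table; the function name is made up, and real format registries and identification tools cover far more formats and versions.

```python
# (signature bytes, format name) pairs for a few common formats.
MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"%PDF-", "PDF"),
]


def identify_format(data: bytes) -> str:
    """Return the format whose signature matches the start of the data, else 'unknown'."""
    for signature, name in MAGIC_NUMBERS:
        if data.startswith(signature):
            return name
    return "unknown"


if __name__ == "__main__":
    print(identify_format(b"%PDF-1.4 ..."))   # PDF
    print(identify_format(b"plain text"))     # unknown
```

A file that answers "unknown" here is exactly the case the slide warns about: without a readable self-description, the format has to be recovered from external metadata or guesswork.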

Pieces of the Solution (2/2)
- People and organizations wishing to make information persist need guidelines on how to go about doing it
- We need to better understand how translating from one storage or display format to another affects the meaning of a work
- We need to save the “behaviors” of a digital object, not just its “contents”

Metadata can be the first line of defense
Can tell you:
- where the file is (if you can’t find the file)
- where more info about the file is (if you have the file but most other metadata has become separated)
- what the file format is
- what the compression scheme is
- what application program and version is needed for the file

Conceptual Approaches to Digital Preservation
- Refreshing: always necessary due to volatility of physical strata
  - Impact on evidential value
- Migration: advantages & disadvantages
- Emulation: advantages & disadvantages
- And will need a long-term managed environment

Managed Environment
- More than temperature & humidity control
- Periodic monitoring of the works
- Periodic monitoring of the technical environment for viewing the works (software, systems, hardware)
- Trusted repositories
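One common way to carry out periodic monitoring of the works is a fixity check: record a checksum for each file, then recompute and compare it on a schedule. The talk does not prescribe a method, so the sketch below is an assumption-laden illustration; SHA-256, the function names, and the report labels are all choices made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large masters do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(paths) -> dict:
    """Build a manifest mapping each file path to its current checksum."""
    return {str(p): sha256_of(Path(p)) for p in paths}


def audit(manifest: dict) -> dict:
    """Re-check every file in the manifest and report 'ok', 'changed', or 'missing'."""
    report = {}
    for name, expected in manifest.items():
        path = Path(name)
        if not path.exists():
            report[name] = "missing"
        elif sha256_of(path) != expected:
            report[name] = "changed"
        else:
            report[name] = "ok"
    return report
```

The manifest itself is metadata and needs the same care as the files: if it is stored only next to the works, a single failure can take out both.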

Migration/Refreshing
- Impact on evidential value

Impediments to Sustainability
- Obsolete:
  - File formats
  - Software
- Copyright laws that prevent:
  - Migrating content into new file formats without prior permission
  - Simulating obsolete software (reverse engineering)

Important Approaches for Sustainability
- Non-proprietary file formats
- Open Source software

Preservation Repositories: Open Archival Info System Model
[Diagram not reproduced; labels: Consumer, Producer, Management]

OCLC/RLG Digital Repository Attributes
- Administrative responsibility
- Organizational viability
- Financial sustainability
- Technological suitability
- System security
- Procedural accountability

Preservation Repositories: Projects based on OAIS Model
- CEDARS
- NEDLIB
- Pandora
- CDL
- OCLC/RLG Working Group on Preservation Metadata, Attributes of a Trusted Digital Repository, August 2001

OCLC/RLG Selected Recommendations
- Policies, Certification processes, Risk management, Persistent ID, Migration/Emulation experiments
- Stakeholders meet to decide how to describe what is in a digital repository
- Examine special properties of particular classes of digital objects
- Technical standards for exchange and interoperability between repositories
- Develop projects and case studies
- Copyright issues

OCLC/RLG Efforts, Working Group I: Preservation Metadata Framework
- …to define the concept of preservation metadata, describe its importance in context of the overall digital preservation process, examine the "state-of-the-art" in the use of metadata in support of digital preservation, and evaluate the prospects for a community-wide, consensus-building activity in the area of preservation metadata (Preservation Metadata for Digital Objects: A Review of the State of the Art, http://www.oclc.org/research/pmwg/presmeta_wp.pdf)
- …to develop a framework outlining the types of information, i.e., metadata, that should be associated with an archived digital object (A Metadata Framework to Support the Preservation of Digital Objects, http://www.oclc.org/research/pmwg/pm_framework.pdf)
  - an expanded conceptual structure for the Open Archival Information System (OAIS) information model, and
  - a set of metadata elements, mapped to the conceptual structure and reflecting the information concepts and requirements articulated in the OAIS model

OCLC/RLG Efforts, Working Group II: PREservation Metadata: Implementation Strategies (PREMIS)
- develop a core set of implementable preservation metadata elements, with broad applicability within the digital preservation community
- develop a data dictionary to support the preservation metadata element set
- examine and evaluate alternative strategies for the encoding, storage, and management of preservation metadata within a digital preservation system, as well as for the exchange of preservation metadata between systems
- develop a pilot program for testing the group’s recommendations and best practices in a variety of systems settings
- explore opportunities for the cooperative creation and sharing of preservation metadata

LC’s National Digital Information Infrastructure and Preservation Program (NDIIPP)
- Authorized Dec 2000
- LC, Dept of Commerce, NARA, White House Office of Sci & Tech Policy, with help from CLIR, NLM, NAL, OCLC, RLG
- Ongoing collaborative process
- Commissioned papers on preserving: the Web, periodicals, digital sound, E-Books, Digital TV, Digital Video
- Awarded 8 grants for “Building a Network of Partners” phase (up to $3 million) (Oct 2004)

CLIR/NDIIPP Issue Areas
- Technical and architectural infrastructure (standards, ID, obsolescence)
- Economic and legal (rights mgmt, funding)
- Collection Development (what gets saved?)
- Societal & Institutional (who does what, role for commercial sector)

Journal Archiving
- License, don’t own; may not even be able to obtain right to make archival copy
- Increasingly no paper back-up at all
- Usually we don’t have the important redundancy factor
- Stanford’s LOCKSS Project (Lots of Copies Keep Stuff Safe) and its problems (http://lockss.stanford.edu)

NISO/DLF Technical Image Metadata Workshop, 4/99 (Z39.87-2002 draft)
- create metadata needed to manage images in digital repositories over long periods of time (full life-cycle mgmt)
- document image provenance & history
- ensure that the images will be rendered accurately on any output device

Important Planning Considerations
- File Formats
- Choosing Interoperable Systems
- Adhere to standards
- Vendors with large installed base
- Refreshing and/or Migration

One Final Question: Who will collect the digital works of today that should become the Special Collections of tomorrow?
- web sites
- zines
- electronic journals
- listserv and email discussions
- drafts of works that later become famous

Paradigm Shifts needed (Old vs. New)
- Physical preservation: atmospheric control (old); ongoing management (new)
- What to save?: artifact (old); idea + ancillary material & documentation (new)
- Cataloging: individual work in hand; artifact & documentation; FRBR
- Later access: restaging, ancillary material & documentation

Traditional Digital Library Model
[Diagram not reproduced; labels: DL (repeated), search & presentation (repeated), user (repeated)]

Ideal Digital Library Model
[Diagram not reproduced; labels: DL (repeated), search & presentation, user (repeated)]

Pushing the Envelope: Preservation of Multimedia Materials
Howard Besser, NYU Moving Image Archiving & Preservation Program
- http://sunsite.berkeley.edu/Longevity/
- http://www.firstmonday.dk/issues/issue7_6/besser/
- http://www.tisch.nyu.edu/preservation
- http://www.oclc.org/digitalpreservation/presmeta_wp.pdf
- http://www.interpares.org
- UC Libraries Systemwide Operations and Planning Advisory Group (SOPAG) site, for the UC Digital Preservation & Archiving Committee Final Report: http://www.slp.ucop.edu/sopag/
- http://variablemedia.net/
- http://www.getty.edu/gri/standard/intrometadata/
- http://www.gseis.ucla.edu/~howard/Metadata/UC-May00/
- http://sunsite.berkeley.edu/Metadata/sp2000.html
- http://www.niso.org/commitau.html
- http://www.ifla.org/II/metadata.htm
- METS official site: http://www.loc.gov/standards/mets

Incorporate parts of Functional Requirements for Bibliographic Records (FRBR)
- work
- expression
- manifestation
- item
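The four FRBR levels nest, which a small data-structure sketch can make visible. This is only an illustration of the hierarchy named on the slide; the field names, class layout, and sample values are assumptions and not part of FRBR itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single concrete copy (e.g. one file in one repository)."""
    location: str


@dataclass
class Manifestation:
    """A physical or digital embodiment (e.g. a particular file format)."""
    carrier: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A specific realization of the work (e.g. a particular edit or version)."""
    form: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract intellectual or artistic creation."""
    title: str
    expressions: List[Expression] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Count every concrete copy that exists for a work, across all levels."""
    return sum(
        len(manifestation.items)
        for expression in work.expressions
        for manifestation in expression.manifestations
    )
```

For preservation planning, a count like this gives a rough sense of redundancy: a work with a single item under a single manifestation has no fallback copy.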

Approaches to Solutions
- Save the Hardware & Software
- Emulate
- Migrate