SIPS, DIPS and Trips: How we will know if we've collected enough, or the right, metadata?

“SIPS, DIPS and Trips: How we will know if we've collected enough, or the right, metadata?”
• George Blood Audio, LP
• Safe Sound Archive
Intellectual Access to Preservation Metadata Interest Group
American Library Association
June 2010

Definition by ALA PARS
Digital Preservation: “Digital preservation combines policies, strategies and actions to ensure access to reformatted and born digital content regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time.”

In the words of Grace Hopper...
• “It's easier to ask forgiveness than it is to get permission”
• “A ship in a harbor is safe, but that is not what a ship is built for”
• “From then on, when anything went wrong with a computer, we said it had bugs in it”
• “You manage things; you lead people”

"The great thing about standards is that there are so many to choose from."

Standards are like toothbrushes. Everyone agrees they're desirable… but nobody wants to use someone else's.

Why are we collecting all this metadata?
• To provide for discovery
• To manage the files
• To provide provenance
• To provide authenticity
• Etc.

Metadata = Cataloging and Description
• How much is enough?
• Is it possible to have too much?
• Why do we need more than we did before?
  – Are we moving the goal posts?
  – To what extent are our neuroses about digital preservation a reflection of our failures in analog preservation?
  – Is more metadata less product? By doing “better” for one object are we preserving less overall?
• Has anyone asked the users what they need?

Organizing metadata • “Standards” • Toothbrushes

What is a standard?
• How widely adopted?
• If everyone is doing something... is that good enough to be a “standard”?
• Does a standard have to be perfect?
• Does one size fit all?
• If there’s a standard and no one uses it, what’s it matter?
• What are the implications if there’s a standard and it is “locally modified”?
• If you make your own “standard”, in what ways does this enhance or inhibit preservation and long-term access?
  – Aren’t we taught to avoid proprietary solutions? Why not for metadata?

SIPS: The State of the Art

Oberlin metadata

NYPL - LPA metadata

UMichigan RFI

SI AAA Metadata

SI AAA Second Project

SI Hirshhorn and SI AAA
Sample Rate: 96000
Bit Depth: 24

SI Hirshhorn
Duration: 0:42:19
INFO Name: Hess, Thomas B. "The Breakthrough of Abstract Expressionism."
INFO Artist:
INFO Date: 20090908
INFO Archival Location: Smithsonian Institution Libraries, Hirshhorn Museum Library
INFO Copyright: Material may be protected by copyright. Restrictions may apply.
BEXT Description: Hess, Thomas B. "The Breakthrough of Abstract Expressionism." Lecture at NGA, 11-4-73: 0001, File Identifier; HMSG 0001 A-B, Tape Identifier
BEXT Originator: Hirshhorn Museum Library
BEXT Originator Reference:
BEXT Origination Date: 2009-09-08
BEXT Time Reference: 0
BEXT Version: 1
BEXT Coding History:
A=ANALOG, M=stereo, T=Nakamichi_Dragon; 09095; TDK_C90
A=PCM, F=96000, W=24, M=stereo, T=PrismSound; ADA-8XR; A/D
A=PCM, F=96000, W=24, M=dual-mono, T=MetricHalo; ULN-2; DIO
A=PCM, F=96000, W=24, M=stereo, T=SoX 14.1; DAE

SI AAA
Duration: 0:56:32
INFO Name:
INFO Date:
INFO Archival Location:
INFO Copyright:
BEXT Description: Oral history interview with Tony Rosenthal, 1968 May 10-June 29.; Tony; Sevim; 1968 May 10-June 29
BEXT Originator: Smithsonian Institution
BEXT Originator Reference: Archives of American Art
BEXT Origination Date: 2009-09-22
BEXT Time Reference: 0
BEXT Version: 1
BEXT Coding History:
A=ANALOG, M=mono, T=Revox_A700; 13652; Audiotape_1251
A=PCM, F=96000, W=24, M=mono, T=PrismSound; ADA-8XR; A/D
A=PCM, F=96000, W=24, M=mono, T=MetricHalo; ULN-2; DIO
A=PCM, F=96000, W=24, M=mono, T=SoX 14.1; DAE

CUL METS

How will any of this provide for discovery, management, provenance, etc.?
• It all has to be done manually.
• It is just as much work to create software tools to read the metadata as to make it.
• It costs more to do the metadata work on some projects than the digitization.
• What will be the cost to reformat the metadata when the digital file is migrated?

Open Source! Open Standards!! Interoperability!!! Except MY Metadata

DIPs: Let’s get religion

A return to basics
• When does a record end and context begin?
• When does the archive end and the research begin?
• What is the (end) goal of metadata?
• What is the end (goal) of metadata?

Ernie Ingles
• “Long term preservation of information has plagued mankind since we first etched images into stone tablets. And in many ways it’s been downhill ever since.”
• “We should think of preservation with a 500 year time horizon.”

Quakerism 101

K.I.S.S.
Keep It Stupid Simple
Keep It Simple, Stupid

Pareto’s Principle
• 80% of effect comes from 20% of the causes
  – “80% of your revenue comes from 20% of your clients”
  – “80% of a project can be completed with 20% of your time”
  – “80% of total circulation comes from 20% of the books”
  – “80% of knowledge can be acquired with 20% of the information”

Short Record Dublin Core MARC

• 20100623 • Jun. 23 2010 • June 23, 2010 • Etc.

Date field conversion, Date to number, On Mac, PC, FMP, Different Version
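
The conversion problem can be sketched in a few lines of Python. This is a minimal illustration, not a tool from the talk: the function name `normalize_date` and the list `KNOWN_FORMATS` are hypothetical, the list covers only the three spellings shown on the previous slide, and the month names assume an English locale. Every spelling is reduced to one calendar date, which can then be written out in a single agreed form.

```python
from datetime import date, datetime

# Candidate layouts, taken from the three spellings on the previous slide.
# The list is an assumption for illustration; a real collection needs more.
KNOWN_FORMATS = ("%Y%m%d", "%b. %d %Y", "%B %d, %Y")


def normalize_date(text: str) -> date:
    """Return the calendar date that `text` spells, trying each known layout."""
    for layout in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), layout).date()
        except ValueError:
            continue  # not this layout, try the next one
    # Fail loudly rather than guess at an unfamiliar spelling.
    raise ValueError(f"unrecognized date format: {text!r}")


if __name__ == "__main__":
    for spelling in ("20100623", "Jun. 23 2010", "June 23, 2010"):
        print(spelling, "->", normalize_date(spelling).isoformat())
```

Raising an error on an unknown spelling is a deliberate choice in this sketch: a silently wrong date is harder to find later than a record that was rejected at ingest.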

Sample Rate: 96000
Bit Depth: 24
Duration: 0:42:19
INFO Name: Hess, Thomas B. "The Breakthrough of Abstract Expressionism."
INFO Artist:
INFO Date: 20090908
INFO Archival Location: Smithsonian Institution Libraries, Hirshhorn Museum Library
INFO Copyright: Material may be protected by copyright. Restrictions may apply.
BEXT Description: Hess, Thomas B. "The Breakthrough of Abstract Expressionism." Lecture at NGA, 11-4-73: 0001, File Identifier; HMSG 0001 A-B, Tape Identifier
BEXT Originator: Hirshhorn Museum Library
BEXT Originator Reference:
BEXT Origination Date: 2009-09-08
BEXT Time Reference: 0
BEXT Version: 1
BEXT Coding History:
A=ANALOG, M=stereo, T=Nakamichi_Dragon; 09095; TDK_C90
A=PCM, F=96000, W=24, M=stereo, T=PrismSound; ADA-8XR; A/D
A=PCM, F=96000, W=24, M=dual-mono, T=MetricHalo; ULN-2; DIO
A=PCM, F=96000, W=24, M=stereo, T=SoX 14.1; DAE
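
A record like this is only useful downstream if software can read it back. As a rough sketch, assuming the coding history follows the comma-separated `KEY=value` layout visible above (one line per step in the signal chain), the following Python turns it into a list of dictionaries. The function name `parse_coding_history` is hypothetical, and the sample text is a cleaned-up reading of the slide, so the exact device spellings are an assumption.

```python
# Coding history as read from the slide; spacing and device names are a
# cleaned-up reading of the extracted text, so treat them as assumptions.
EXAMPLE_HISTORY = """\
A=ANALOG,M=stereo,T=Nakamichi_Dragon; 09095; TDK_C90
A=PCM,F=96000,W=24,M=stereo,T=PrismSound; ADA-8XR; A/D
A=PCM,F=96000,W=24,M=dual-mono,T=MetricHalo; ULN-2; DIO
A=PCM,F=96000,W=24,M=stereo,T=SoX 14.1; DAE
"""


def parse_coding_history(text: str) -> list[dict[str, str]]:
    """Split a coding history into one {key: value} dict per processing step."""
    steps = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between steps
        step = {}
        for part in line.split(","):
            key, sep, value = part.partition("=")
            if sep:  # ignore fragments that are not KEY=value
                step[key.strip()] = value.strip()
        steps.append(step)
    return steps


if __name__ == "__main__":
    for number, step in enumerate(parse_coding_history(EXAMPLE_HISTORY), 1):
        print(number, step.get("A"), step.get("F", "n/a"), step.get("T"))
```

Note that this sketch would break on a free-text `T=` field that itself contains a comma, which is exactly the kind of local variation that makes reading metadata as costly as writing it.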

1. Achieve consensus on a standard
2. K.I.S.S.
3. Expose more complexity only as needed

Conformance to Standards within the model
Layer 1: Required
Layer 2: Recommended
Layer 3: Optional

How much is enough? How much is being left behind?
- 80% of information is available in 20% of the data
- 80% isn’t good enough
If we apply Pareto to the remaining information, the next 20% of effort yields 80% of the remaining information.
80% of 20% is 16%.
First 80% plus the next 16% is 96% of total information.
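
The arithmetic above generalizes: each round of effort captures 80% of whatever is still missing. A small Python sketch, with the hypothetical helper name `coverage_after` and the 80% share as an adjustable assumption, makes the pattern explicit.

```python
def coverage_after(rounds: int, share: float = 0.8) -> float:
    """Fraction of total information captured after repeated Pareto rounds.

    Each round captures `share` of what the previous rounds left behind.
    """
    remaining = 1.0
    for _ in range(rounds):
        remaining *= 1.0 - share  # what is still uncaptured after this round
    return 1.0 - remaining


if __name__ == "__main__":
    for rounds in (1, 2, 3):
        print(f"after {rounds} round(s): {coverage_after(rounds):.0%}")
```

One round gives the familiar 80% and two rounds give the 96% from the slide; a third round adds less than four more points, which is the case for exposing further complexity only as needed.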

Conformance to Standards within the model
Layer 1: Required
Layer 2: Recommended
Layer 3: Optional

Layer 1: Consensus
Layer 2: Structured Variety
Layer 3: Whoopie!
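
One way to picture layered conformance is as a check that fails only on Layer 1, reports gaps in Layer 2, and lets anything else through as Layer 3. The Python below is a sketch under stated assumptions: the talk does not say which fields belong in which layer, so the field names in `REQUIRED` and `RECOMMENDED`, and the function name `conformance_report`, are hypothetical.

```python
# Hypothetical layer assignments; the talk leaves the actual fields to a
# future task force, so these names are placeholders only.
REQUIRED = {"title", "date", "identifier"}       # Layer 1: consensus
RECOMMENDED = {"creator", "rights", "format"}    # Layer 2: structured variety


def conformance_report(record: dict[str, str]) -> dict[str, object]:
    """Check a flat metadata record against the three layers."""
    # A field counts as present only if it has a non-blank value.
    present = {key for key, value in record.items() if value and value.strip()}
    missing_required = sorted(REQUIRED - present)
    return {
        "conforms": not missing_required,  # only Layer 1 can fail a record
        "missing_required": missing_required,
        "missing_recommended": sorted(RECOMMENDED - present),
        # Layer 3: anything else is allowed and simply passed through.
        "optional_fields": sorted(present - REQUIRED - RECOMMENDED),
    }


if __name__ == "__main__":
    record = {
        "title": "The Breakthrough of Abstract Expressionism",
        "date": "2009-09-08",
        "identifier": "HMSG 0001 A-B",
        "creator": "",
        "coding_history": "A=ANALOG,M=stereo",
    }
    print(conformance_report(record))
```

The design point is that a record with an empty `creator` and an extra `coding_history` field still conforms: the required core stays small, and local richness is tolerated rather than forbidden.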

ALA Definition of Digital Preservation
Parallel to Definition of Digital Preservation
Layer 1: Short, clear, quick
Layer 2: Most useful in most circumstances
Layer 3: Everything to everybody

Challenge to the Group (a la Definition of Digital Preservation):
- Convene a Task Force
- Develop standards for DIPs
- Present version 0.9 (draft) at this Interest Group at ALA Midwinter 2011