Designing Digital Libraries, Museums, Archives Howard Besser NYU Archiving and Preservation Program and Library Senior Scientist http://www.tisch.nyu.edu/preservation http://www.gseis.ucla.edu/~howard Besser--Missouri Digitization 2/18/03
Designing Digital Libraries, Museums, Archives • Models for Digital Repositories • Importance of Metadata Standards & Philosophies • Introduction • Discovery Metadata: The Dublin Core • Administrative & Structural Metadata; Digital Object Standards (METS) • Fitting METS in--Content Management • Content Format Standards (Images) • Longevity & Preservation Repositories • Other Elements • Actors Metadata • Preserving Electronic Art ...
Columbia Library
Using Online Catalog for…
Library Workstations for…
Models for Digital Repositories
From Digital Collections to Digital Libraries, Museums, and Archives • No longer merely experiments • Adhere to our fields’ traditions (access, interoperability, sustainability, privacy, …) • Provide services
To respond to our needs for both Service & Traditions, we face the challenges of: • Access (discovery) • Sustainability (longevity) • Interoperability
Serious Longevity Problems: What we know from prior widespread digital file formats • Images separating from their metadata • Inaccessibility of software needed to view an image • Inability to even decode the file format of an image
Traditional Digital Repository Model [diagram labels: DL; search & presentation; DL; DL; search & presentation; user]
Ideal Digital Repository Model [diagram labels: DL; DL; search & presentation; user]
Importance of Metadata Standards & Philosophies
For Interoperability, Repositories Need Standards (as well as Sustainability & Access) • Descriptive Metadata for consistent description • Discovery Metadata for finding • Administrative Metadata for viewing and maintaining • Structural Metadata for navigation • ... • Terms & Conditions Metadata for controlling access • ...
Why are Standards and Metadata consensus important? • Managing digital files over time • Longevity • Interoperability • Veracity • Recording in a consistent manner • Will give vendors incentive to create applications that support this
Philosophical Metadata Decisions • Warwick vs MARC • Where to put the metadata
Containers and Packages of Metadata: Warwick, not MARC • modular • overlapping • extensible • community-based • designed for a networked world • to aid commonality between communities while still providing full functionality within each community
Some different schemes for where Metadata is kept • embedded within the object (TIFF headers) • encapsulated with image (MOA2/METS) • in a separate related DB maintained by same organization (OPAC) • in a separate DB maintained by a separate organization (Books in Print, ratings systems)
Discovery Metadata • Dublin Core - NISO Z39.85 (3/95) • CBIR (ongoing)
Dublin Core--further work • Warwick Framework – metadata packages for extensible functions – laid groundwork for RDF • Canberra Qualifiers – refining the semantics of the element set to provide more precise info – SUBELEMENT, SCHEME, LANG • Granularity – no hierarchical relationships within a given DC record; only one record per discrete object (collection or item-level), and relationship field plus qualifier links them
The Research Process and Functional Categories of Metadata • Discovery • Retrieval • Collation • Analysis • Re-presentation
Metadata Mapping • Crosswalks • Resource Description Framework (RDF) • Open Archives & metadata harvesting
Crosswalks • mapping between differing metadata structures • eliminate the need for monolithic, universally adopted standards • focus on flexibility and interoperability • RDF-based metadata registries
Crosswalk Example
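A crosswalk can be pictured as a lookup table from one scheme's elements to another's. The sketch below is only an illustration: the Dublin Core to MARC targets in `DC_TO_MARC`, the function name `crosswalk`, and the sample record are assumptions made for this example, not a published mapping.

```python
# Illustrative crosswalk: Dublin Core element -> (MARC tag, subfield).
# These targets are assumptions for the example; consult a published
# crosswalk before relying on any of them.
DC_TO_MARC = {
    "title": ("245", "a"),
    "description": ("520", "a"),
    "publisher": ("260", "b"),
    "date": ("260", "c"),
    "rights": ("540", "a"),
}


def crosswalk(dc_record):
    """Map a flat Dublin Core record to (tag, subfield, value) triples.

    Elements with no mapping are returned separately instead of being
    dropped, so information lost in translation stays visible.
    """
    mapped, unmapped = [], {}
    for element, value in dc_record.items():
        target = DC_TO_MARC.get(element.lower())
        if target is None:
            unmapped[element] = value
        else:
            mapped.append((target[0], target[1], value))
    return mapped, unmapped


# Hypothetical record used only to exercise the mapping.
record = {"Title": "Scrapbook", "Date": "1923", "Coverage": "California"}
mapped, unmapped = crosswalk(record)
```

Returning the unmapped elements is a design choice for the sketch: crosswalks between schemes of different richness are lossy, and surfacing what did not map makes that loss easy to audit.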
Resource Description Framework (RDF, spec released 2/99) • W3C Metadata activity • designed to move the Web beyond simple links to semantically-rich relationships between resources • metadata application using XML as a common syntax for exchange and processing • flexible architecture for managing diverse application-specific metadata packets that can be processed by machines • associates resources, property types, and corresponding values • http://www.w3.org/RDF/
RDF • Resources (character strings, names, digital objects) • Property (“is the author of”) • Value • resources+properties=relationships • many different relationships can be reflected
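The resource / property / value model above can be sketched as a list of three-part statements. Everything named here (`Statement`, `describe`, `pointing_at`, and the example.org identifiers) is hypothetical and chosen only to illustrate the idea.

```python
from collections import namedtuple

# One RDF-style statement: a resource, a property, and a value.
Statement = namedtuple("Statement", ["resource", "property", "value"])

# Hypothetical statements; the example.org identifiers are placeholders.
statements = [
    Statement("http://example.org/report", "creator", "Howard Besser"),
    Statement("http://example.org/report", "title", "Designing Digital Libraries"),
    Statement("http://example.org/scan.tif", "source", "http://example.org/report"),
]


def describe(resource, statements):
    """Collect every property/value pair asserted about one resource."""
    return {s.property: s.value for s in statements if s.resource == resource}


def pointing_at(value, statements):
    """Find the resources related to `value`, with the relating property."""
    return [(s.resource, s.property) for s in statements if s.value == value]
```

Because a value can itself be a resource, the same flat list expresses many different relationships, which is the point of the last two bullets.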
XML-encoded RDF
• <?xml:namespace ns="http://www.w3.org/RDF" prefix="RDF" ?>
• <?xml:namespace ns="http://purl.oclc.org/DC/" prefix="DC" ?>
• <RDF:RDF>
• <RDF:Description>
• <DC:Creator>Howard Besser</DC:Creator>
• </RDF:Description>
• </RDF:RDF>
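The slide uses draft-era namespace declarations that current XML parsers do not process. As a rough sketch, the same statement can be written with `xmlns` attributes and read back with Python's standard library. The two namespace URIs are the ones used by RDF and Dublin Core today rather than those on the slide, and the `rdf:about` value and the function name `triples` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URIs in current use (not the draft-era ones shown on the slide).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical document; the rdf:about identifier is a placeholder.
doc = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="http://example.org/talk">
    <dc:creator>Howard Besser</dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text):
    """Return (resource, property, value) tuples from simple RDF/XML.

    Handles only flat rdf:Description elements with literal values,
    which is all this illustration needs.
    """
    root = ET.fromstring(xml_text)
    found = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        resource = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            found.append((resource, prop.tag, prop.text))
    return found
```

ElementTree reports property names in `{namespace}local` form, so the Dublin Core creator comes back as `{http://purl.org/dc/elements/1.1/}creator`.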
Open Archives & metadata harvesting
Standardized Digital Objects: METS (Metadata Encoding & Transmission Standard) (slides courtesy of Bernie Hurley, UCB Library Chief Scientist)
Structural & Administrative Metadata • Not enough to merely capture still images (book example) • Must capture Behaviors
What is a “Digital Object”? • Combined Digital Content & Metadata – Digital Content • Digitized materials -- photographs, page images from a book, maps, digitized audio or video… • Born Digital – GIS maps, digitally captured audio or video, numeric datasets (census files, scientific dataset), Web sites… – Metadata • Descriptive • Administrative • Structural • Behavior
What is METS? • An XML Schema that is used to Encode all the Content and Metadata for a Digital Object – The relationships between content and metadata are also captured • METS Object -- METS Document • A METS Document can be – A single file with all content & metadata – A “hub document” that points to content and metadata – A combination of the above
Uses of METS • Transfer Syntax – Standard for transmitting/exchanging digital objects – SIP (Open Archival Information System Reference Model) • Functional Syntax – basis for providing end users with the ability to view and navigate digital content and its associated metadata – DIP • Archiving Syntax – standard for archiving digital objects – AIP
Why Is METS Important? • Interoperability – Share objects between digital library systems – Allow a DL to work with objects from other repositories • Scalability – Same software can be used to index, navigate and display different content types • E.g., book, diary, scrapbook, music score, etc. • Preservation – Aids Migration Strategies
History of METS • Originates in Making of America II Initiative – Making of America II (MOA2) was an NEH-funded Digital Library Federation initiative started in 1997. Participants included UC Berkeley (lead), Stanford, Penn State, Cornell, and NYPL. – GOAL: to create a digital object standard for encoding structural, descriptive and administrative metadata along with primary content – RESULT: MOA2.DTD (an XML DTD) • Adopted by UC Libraries
History of METS (cont’d) • Concerned Parties Meet at NYU in February, 2001 to Discuss Future of MOA2 – Additional needs emerge • Support for time-based content • More flexibility in Descriptive and Administrative metadata – Outcome • MOA2 revised & renamed to METS • Outcome: mets.xsd is endorsed by DLF • METS Governance Structure – Editorial Board, Jerry McDonough is Chair • RLG coordinates editorial board activities • Library of Congress is the Maintenance Agency for METS
A Partial List of Organizations that Plan to Use METS • California Digital Library • UC Berkeley • Library of Congress (A/V project) • Harvard • NYU • Stanford • MIT • MetaE (Metadata Engine Project: R&D project funded by the European Commission) • British Library
Display of METS Objects
How Does METS Work? • METS uses XML to 1) Identify the digital pieces (files) that together comprise a digital object • Scrapbook: Digitized pages, photographs, newspaper clippings, digital audio, etc. 2) Specify the location of these pieces • Are we pointing to these files? • Are they embedded in the METS document? • A combination of the above?
3) Express structural relationships between: [Think of the “structure” as a “Table of Contents”] • Content files – Links the proper content files to the TOC entry for the scrapbook’s cover, page 1, page 2, the photo on page 20, the DVD on page 50, etc. • Descriptive Metadata (DM) – Links the proper DM entries to the TOC, so you can have separate DM entries for the scrapbook, photos, audio DVDs… • Administrative Metadata (AM) – Links AM entries to the TOC or to files (e.g., links rights MD to a photo, TechMD to a group of files) • Behaviors – Links the proper behaviors to TOC entries (e.g., links program to run the audio to the DVD TOC entry)
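The three steps above (identify the files, locate them, tie them to a table of contents) can be sketched with a tiny "hub" METS document and a few lines of Python. The element and attribute names (`fileSec`, `fileGrp`, `file`, `FLocat`, `structMap`, `div`, `fptr`) come from the METS schema; the file names, URLs, and the helper `files_by_division` are hypothetical, and the document is a reduced fragment rather than a complete METS record.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"

# A reduced hub document: it points to content files instead of embedding
# them. All URLs are placeholders.
doc = f"""<mets:mets xmlns:mets="{METS}" xmlns:xlink="{XLINK}">
  <mets:fileSec>
    <mets:fileGrp USE="master">
      <mets:file ID="F1" MIMETYPE="image/tiff">
        <mets:FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.tif"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="access">
      <mets:file ID="F2" MIMETYPE="image/jpeg">
        <mets:FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.jpg"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <mets:div TYPE="scrapbook">
      <mets:div TYPE="page" LABEL="Page 1">
        <mets:fptr FILEID="F1"/>
        <mets:fptr FILEID="F2"/>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def files_by_division(xml_text):
    """Map each labeled division that has file pointers to its file locations."""
    root = ET.fromstring(xml_text)
    # Steps 1 and 2: inventory the files and where each one lives.
    locations = {}
    for file_el in root.iter(f"{{{METS}}}file"):
        flocat = file_el.find(f"{{{METS}}}FLocat")
        locations[file_el.get("ID")] = flocat.get(f"{{{XLINK}}}href")
    # Step 3: follow each division's pointers into the inventory.
    result = {}
    for div in root.iter(f"{{{METS}}}div"):
        pointers = [p.get("FILEID") for p in div.findall(f"{{{METS}}}fptr")]
        if pointers:
            result[div.get("LABEL")] = [locations[i] for i in pointers]
    return result
```

Calling `files_by_division(doc)` resolves "Page 1" to both its TIFF master and its JPEG access copy, which is the table-of-contents linkage the slide describes.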
Anatomy of METS File [diagram of sections: METS Header; Descriptive Metadata; Admin. Metadata; Inventory; Structural Map; Behavior Metadata. Labels on the diagram: (Optional), (Optional, but typical), (Required), (Optional)]
1. METS Header • Records Administrative Metadata about the METS Document itself, such as – Author/agent & agent role • E.g., UC Berkeley Library as custodian – Alternate identifiers for METS document – Creation and update dates and times – Status
2. Structural Map Section(s) • Specifies the Structure of the Digital Object as a Hierarchy of Division (div) Elements • Division (type=“scrapbook”) • Division (type=“page”) • Division (type=“photo”) • Division (type=“digital audio file”) • Division (type=“page”) • Division (type=“letter”) • Division (type=“photo”) • Division (type=“newspaper clipping”)
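A hierarchy of divisions is a tree, so it can be sketched as nested tuples and walked recursively. The slide's indentation did not survive, so the nesting below (two pages under the scrapbook, items under each page) is an assumption, as are the names `scrapbook` and `outline`.

```python
# Each division is (type, children). The nesting is assumed for illustration;
# the original slide's indentation was lost.
scrapbook = ("scrapbook", [
    ("page", [
        ("photo", []),
        ("digital audio file", []),
    ]),
    ("page", [
        ("letter", []),
        ("photo", []),
        ("newspaper clipping", []),
    ]),
])


def outline(division, depth=0):
    """Return the division tree as indented lines, like a table of contents."""
    division_type, children = division
    lines = ["  " * depth + division_type]
    for child in children:
        lines.extend(outline(child, depth + 1))
    return lines
```

`"\n".join(outline(scrapbook))` gives an indented table of contents, which is how the structural map is meant to be read.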
3. File Section • Records all of the Files that Together Comprise the Content of the Digital Object – Files may be internal or external to the METS document (or both) • Files are organized into File Groups based on format (tiff, hi-res jpeg, med-res jpeg, gif, etc) • Files are linked to the Structural Map
3. File Section (cont.) • Scrapbook Example (a complex object) – 100 Digitized pages with text entries • Three images per page (GIF, JPEG, TIFF) • Transcribed text for each page – Photos and newspaper clippings attached to the pages – Envelopes glued to the pages that hold • Letters & cards • DVDs
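Grouping files by format, as the File Section does, can be sketched for the 100-page scrapbook with three images per page. The file-naming pattern and the function `build_file_groups` are hypothetical.

```python
from collections import defaultdict


def build_file_groups(page_count, formats=("tiff", "jpeg", "gif")):
    """Group page-image file names by format, one group per format.

    The naming pattern (page001.tiff, ...) is invented for this sketch.
    """
    groups = defaultdict(list)
    for page in range(1, page_count + 1):
        for fmt in formats:
            groups[fmt].append(f"page{page:03d}.{fmt}")
    return dict(groups)


groups = build_file_groups(100)
total_files = sum(len(names) for names in groups.values())
```

One group per format keeps all masters together and all delivery derivatives together, while the structural map still ties each page to its three renditions.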
4. Descriptive Metadata Section(s) • METS can Record all of the Units of Descriptive Metadata Pertaining to the Digital Object – Multiple Descriptive Metadata Sections can Exist in a METS Document • Descriptive Metadata – could take any form • E.g., a MARC or Dublin Core record, Finding Aid – May be • Internal or external to the METS document (or both)
5. Administrative Metadata Section(s) • 4 Flavors of Admin. Metadata Per Section – Technical metadata – Source Metadata – Rights Metadata – Digital Provenance Metadata • Admin. Metadata may be – Internal or external to the METS document (or both) – Linked to files or file groups, or the structural map
6. Behavior Section • Behavior Sections Identify Software that can be used with the Digital Object, or its Parts – E.g., Software to View the Complex Digital Object which is the Scrapbook; Software to listen to the DVD • A Behavior Unit May Contain: – A reference to an external interface definition that defines a set of related behaviors – A reference to an external executable that implements these behaviors – A reference to the Division or Divisions of the object structure to which the behaviors apply.
Some Characteristics of the New Information Environment • Increased Quantity of Information – With the Web, everyone can become a publisher – Varying level of quality • Digital Libraries Need to Work With New Classes of Information – Web Pages, Museum Artifacts, GIS, Statistical Information, etc.
Characteristics of the New Information Environment (Cont.) • Information is Decentralized – Distributed repositories • Information is in Proprietary Formats – Everyone has their own method of creating a digital book, journal, manuscript, etc. • How Do We Cope??
Defining Digital Libraries in the NIE • A Series of Collaborating Services & Systems that Allow for the Discovery, Display, Maintenance and Preservation of Complex Digital Objects – The Traditional ILS • Created to manage physical materials • Almost all metadata is descriptive (e.g., MARC) – Digital Libraries • Created to manage complex digital objects • New types of metadata (administrative, structural, etc.) • New Services (content management, digital preservation)
Complex Digital Objects • Scrapbook Example – Digitized pages with text entries – Photos and newspaper clippings attached to the pages – Envelopes glued to the pages that hold • Letters & cards • DVDs • The Scrapbook has – Multiple material types (text, image, audio) – Structure (e.g., like a table of contents) – Internal Relationships • The DVD on page 5 is linked to the file that is the DVD content and to its descriptive metadata
“A Series of Collaborating Services” • Content Management Systems (CMS) – Create & maintain complex digital objects • Preservation Repositories – Long-term retention of digital objects • Access Systems & Integration – Global Access Portals – Subject Access Portals – Material Type Portals
How Can These Systems Collaborate? • Via “Standardized Digital Objects” – A means to “wrap-up” a digital object and send it to another system or repository • Same idea as MARC, but for entire digital objects • E.g., a CMS sending a digital object to a Preservation Repository • The METS Digital Object Standard – Metadata Encoding and Transmission Standard
Illustrative Digital Library Services Diagram [diagram labels: Global Access Portal; Material Type Portal [books]; Material Type Portal [images]; Material Type Portal [fossils]; Content Management; Preservation Repository; METS on the links between systems]
Content Management Systems
Content Management Systems • Used to… – Create and edit digital objects – Import & export digital objects – Manage objects (acquire, inventory, validate) • Content Management Systems will Vary Depending on the Materials they Support – Metadata schemes will vary • Descriptive Metadata – MARC/MODS/Dublin Core for Books – Code books for numeric datasets • Administrative Metadata – Images, audio, text, etc.
Content Format Standards (Images)
Images • Content Format & Best Practices • Identification/Provenance • Technical Imaging metadata • Special discovery & descriptive metadata
Best practices • Use/Users/Collection: Benchmarking • Masters vs. Derivatives • Scanning • Administrative Metadata • Structural Metadata
Scanning Best Practices • Think about users (and potential users), uses, and type of material/collection • Scan at the highest quality that does not exceed the likely potential users/uses/material • Do not let today’s delivery limitations influence your scanning file sizes; understand the difference between digital masters and derivative files used for delivery • Many documents which appear to be bitonal actually are better represented with greyscale scans • Include color bar and ruler in the scan • Use objective measurements to determine scanner settings (do NOT attempt to make the image look good on your particular monitor or use image processing to color correct) • Don’t use lossy compression • Store in a common (standardized) file format • Capture as much metadata as is reasonably possible (including metadata about the scanning process itself)
Why Scale is important
Identification/Provenance (Images) • The number of variant forms of a work can be enormous • Image Families • A digital image frequently has many layers of parentage • Information about the parentage that can indicate the quality and veracity of the image (Dublin Core "Source" and "Relation") • how to deal with different versions derived from the same scan or different encoding schemes • Vocabulary Standards to express this
The number of variant forms of a work can be enormous • different views of the same object • different scans of the same photo • different resolutions • different compression schemes • different compression ratios • different file storage formats • different details of the same image • ...
Image Families
Identification/Provenance • how to deal with different versions (browse, hi-res, medium res) derived from the same scan or different encoding schemes (TIFF, PICT, JFIF) • Vocabulary Standards to express this – VRA Surrogate Categories – CIMI's “Image Elements”
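Parentage within an image family can be sketched as a simple "derived from" table, in the spirit of Dublin Core "Source" and "Relation". The file names and the helpers `lineage` and `family` are hypothetical.

```python
# child -> parent it was derived from. All names are invented placeholders.
derived_from = {
    "scan_master.tif": "photo_print",
    "hi_res.jpg": "scan_master.tif",
    "browse.jpg": "hi_res.jpg",
    "detail.tif": "scan_master.tif",
}


def lineage(name, derived_from):
    """Trace one image back through every layer of parentage."""
    chain = [name]
    while chain[-1] in derived_from:
        chain.append(derived_from[chain[-1]])
    return chain


def family(root, derived_from):
    """Return the root plus everything derived from it, directly or not."""
    members = {root}
    changed = True
    while changed:
        changed = False
        for child, parent in derived_from.items():
            if parent in members and child not in members:
                members.add(child)
                changed = True
    return members
```

`lineage("browse.jpg", derived_from)` walks from the browse image through the hi-res JPEG and the master scan back to the original print, which is the information a user needs to judge an image's quality and veracity.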
Incorporate parts of Functional Requirements for Bibliographic Records (FRBR) • work • expression • manifestation • item • (and push into “change history” section of Technical Image Metadata)
NISO/DLF Technical Image Metadata Workshop--4/99 (Z39.87-2002 draft) • create metadata needed to manage images in digital repositories over long periods of time (full life-cycle mgmt) • document image provenance & history • ensure that the images will be rendered accurately on any output device
Technical Image Metadata: Focus on Metadata that may prove helpful for • management • use • preservation • ...
Technical Image Metadata: In Scope • still, bit-mapped pictorial images • scanned/reformatted images (+ born digital)
Technical Image Metadata: Out of Scope • vector images moving images of OCR-able text • structural and hierarchical relationships between images • rights management, terms of use (authenticity/security)
Technical Image Metadata - Z39.87 • Image parameters (MIME type, compression, colorspace & profile, …) • Image Creation (source, capture info, etc.) • Image performance assessment (sampling, colormap, whitepoint, target data, etc.) • Change history (source, processing, etc.)
Technical Image Metadata - Z39.87 • additional XML implementation schema (MIX)
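The four groups of technical image metadata listed above can be sketched as a small record type. The class and field names are invented for this illustration and are not the element names defined by Z39.87 or MIX.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class ImageParameters:
    # Hypothetical field names, not Z39.87/MIX element names.
    mime_type: str
    compression: str
    colorspace: str


@dataclass
class ChangeEvent:
    source: str        # what the new version was made from
    processing: str    # what was done to it


@dataclass
class TechnicalImageMetadata:
    parameters: ImageParameters
    creation: dict      # source, capture info, ...
    performance: dict   # sampling, target data, ...
    change_history: list = field(default_factory=list)

    def record_change(self, source, processing):
        """Append one entry to the change history."""
        self.change_history.append(ChangeEvent(source, processing))


# Hypothetical values used only to exercise the record.
metadata = TechnicalImageMetadata(
    parameters=ImageParameters("image/tiff", "none", "RGB"),
    creation={"source": "photographic print", "capture": "flatbed scanner"},
    performance={"sampling_ppi": 600},
)
metadata.record_change("scan_master.tif", "cropped to image area")
as_plain_dict = asdict(metadata)
```

Converting the record with `asdict` gives a plain nested dictionary, a convenient intermediate form before writing it out in whatever XML schema a repository actually uses.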
Other Metadata • Description of depiction/surrogate (What VRA calls its "Surrogate Categories") • Description of original object • Rights and Reproduction Information • Location Information • VRA Core, LCSH, TGM, AAT, ULAN, TGN, DOI, <indecs>, ...
Longevity & Preservation Repositories
Digital Preservation • The Problem • Preservation Repositories • Preservation Metadata • Other Digital Preservation Activities • Special concerns of Cultural Heritage community
Serious Longevity Problems • What we know from prior widespread digital file formats • Previous formats required little ongoing intervention (remote storage facilities, Iron Mountain); digital formats require intense ongoing management • The Short Life of Digital Info
The Short Life of Digital Info: Digital Longevity Problems • Disappearing Information • The Viewing Problem • The Scrambling Problem • The Inter-relation Problem • The Custodial Problem • The Translation Problem
The Viewing Problem • Digital Info requires a whole infrastructure to view it • Each piece of that infrastructure is changing at an incredibly rapid rate • How can we ever hope to deal with all the permutations and combinations?
The Scrambling Problem • Dangers from: – Compression to ease storage & delivery – Container Architecture to enhance digital commerce
The Inter-relation Problem • Info is increasingly inter-related to other info • How do we make our own Info persist when it points to and integrates with Info owned by others? • What is the boundary of a set of information (or even of a digital object)?
The Custodial Problem • In the past, much of survival was due to redundancy • How do we decide what to save? • Who should save it? – Mellon-funded E-Journal Archives • How should they save it?
The Custodial Problem: How to save information? • Methods for later access – Refreshing – Migration – Emulation • Issues of authenticity and evidence
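Refreshing, the simplest of the three methods, copies the bits to new media without changing them. A minimal sketch follows, assuming a SHA-256 checksum is an acceptable proof that the copy is identical; the function names and file names are hypothetical, and a temporary directory stands in for old and new storage.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute a SHA-256 checksum by reading the file in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(source, destination):
    """Copy a file to new storage and verify the bits are unchanged."""
    before = sha256_of(source)
    shutil.copyfile(source, destination)
    after = sha256_of(destination)
    if before != after:
        raise IOError(f"checksum mismatch while refreshing {source}")
    return after


# Demonstration with placeholder content in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    old_copy = Path(workdir) / "master_old_media.bin"
    new_copy = Path(workdir) / "master_new_media.bin"
    old_copy.write_bytes(b"digital master")
    digest = refresh(old_copy, new_copy)
    refreshed_bytes = new_copy.read_bytes()
```

Migration and emulation cannot be checked this way, because migration deliberately changes the bits and emulation changes the environment; that is where the issues of authenticity and evidence become harder.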
The Translation Problem • Content translated into new delivery devices changes meaning – A photo vs. a painting – If Info is produced originally in digital form in one encoded format, will it be the same when translated into another format? • Behaviors
Older Longevity Projects http://sunsite.berkeley.edu/Longevity/ • CPA Task Force • Getty “Time & Bits” Conference & Follow-ups • Preservation experiments in US and Europe – NEDLIB, CURL, Michigan • Internet Archive • Long Now
Preservation Repositories: Projects based on OAIS Model • CEDARS • NEDLIB • Pandora • CDL • OCLC/RLG Working Group on Preservation Metadata, Attributes of a Trusted Digital Repository, August 2001
Preservation Metadata • OCLC/RLG Working Group on Preservation Metadata, Preservation Metadata for Digital Objects: A Review of the State of the Art, January 31 2001 • OCLC/RLG Working Group on Preservation Metadata, A Recommendation for Content Information, October 2001
Preservation Repositories: Open Archival Info System Model [diagram labels: Consumer; Producer; Management]
Preservation Repositories: Open Archival Info System Model • High-level reference model describing submission, organization and management, and continuing access • Conceptual framework for different organizations to share discussions with a common language • Producers, consumers, management, actual repository • SIP, DIP, AIP consists of data objects plus representation info (Content, Preservation Description, Packaging, Descriptive) • Originally developed for Space Science community
Preservation Repositories -- AIP Metadata • Preservation Description Info – reference info – context info – provenance info – fixity info • Packaging Info • Descriptive Info • Content Info
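The kinds of information listed for an AIP can be sketched as a nested record. The class and field names below are invented for this illustration (the OAIS model defines the concepts, not a Python layout), and SHA-256 is assumed as the fixity mechanism.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class PreservationDescription:
    reference: str     # identifier for the content
    context: str       # why it exists, how it relates to other material
    provenance: list   # history of custody and processing
    fixity: str        # checksum proving the content is unaltered


@dataclass
class ArchivalInformationPackage:
    content: bytes               # the data object itself
    representation_info: str     # what is needed to interpret the bits
    pdi: PreservationDescription
    packaging: str               # how the pieces are bound together
    descriptive: dict            # what supports discovery

    def fixity_ok(self):
        """Check the stored checksum against the current content."""
        return hashlib.sha256(self.content).hexdigest() == self.pdi.fixity


# Hypothetical package used only to exercise the record.
payload = b"page image bytes"
aip = ArchivalInformationPackage(
    content=payload,
    representation_info="image/tiff",
    pdi=PreservationDescription(
        reference="urn:example:scrapbook:page1",
        context="page 1 of a digitized scrapbook",
        provenance=["scanned", "ingested"],
        fixity=hashlib.sha256(payload).hexdigest(),
    ),
    packaging="single METS hub document",
    descriptive={"title": "Scrapbook, page 1"},
)
```

Keeping the fixity value inside the package means any later holder can rerun `fixity_ok()` without access to the system that created it.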
OCLC/RLG Digital Repository Attributes • Administrative responsibility • Organizational viability • Financial sustainability • Technological suitability • System security • Procedural accountability
OCLC/RLG Selected Recommendations • Policies, Certification processes, Risk management, Persistent ID, Migration/Emulation experiments • Stakeholders meet to decide how to describe what is in a digital repository • Examine special properties of particular classes of digital objects • Technical standards for exchange and interoperability between repositories • Develop projects and case studies • Copyright issues
Other Digital Preservation Activities • LC Natl Dig Info Infrastructure & Preservation • InterPARES • Emulation Projects • E-Journal Archiving • ERPANET • Persistent Naming
LC’s National Digital Information Infrastructure and Preservation Program • Authorized Dec 2000 • LC, Dept of Commerce, NARA, White House Office of Sci & Tech Policy • with help from CLIR, NLM, NAL, OCLC, RLG • Ongoing collab process • Commissioned papers on preserving: the Web, periodicals, digital sound, E-Books, Digital TV, Digital Video
InterPARES: International Research on Permanent Authentic Records in Electronic Systems • Ongoing international archival world project examining how to make electronically-generated records last over time • Developing theoretical and methodological knowledge needed, then will formulate model policies, strategies, and standards • In 2003 was extended to include images and rich media
Emulation Projects • CAMiLEON (Michigan/Leeds) • NEDLIB
E-Journal Archiving • Issues – License, don’t own; may not even be able to obtain right to make archival copy – Increasingly no paper back-up at all – Usually we don’t have the important redundancy factor • Mellon funded projects (2001) – Yale, Harvard, Penn working w/individual publishers – Cornell, NYPL--specific disciplines – MIT exploring characteristics that change (dynamic) – Stanford--archiving software tools
Electronic Resource Preservation and Access NETwork (ERPANET) • Best practices and skills development for digital preservation of cultural heritage and scientific objects • 3 year project launched Nov 2001; 1.2 million Euros
Persistent Naming • URNs • Handles • PURLs • Re-directs
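What these naming schemes share is a layer of indirection: citations use a stable name, and a resolver maps that name to wherever the object currently lives. The toy resolver below is a sketch of that idea only; the `Resolver` class, the `urn:example:` names, and the URLs are hypothetical and do not reproduce how URN, Handle, or PURL services are actually implemented.

```python
class Resolver:
    """Toy name resolver: persistent name -> current location."""

    def __init__(self):
        self._locations = {}

    def register(self, name, url):
        """Record (or update) where a named object currently lives."""
        self._locations[name] = url

    def resolve(self, name):
        """Return the current location, as a redirect target would."""
        if name not in self._locations:
            raise KeyError(f"unknown persistent name: {name}")
        return self._locations[name]


resolver = Resolver()
resolver.register("urn:example:scrapbook", "http://old.example.org/scrapbook")
first_location = resolver.resolve("urn:example:scrapbook")

# The object moves; only the resolver entry changes, not the citations.
resolver.register("urn:example:scrapbook", "http://new.example.org/objects/42")
second_location = resolver.resolve("urn:example:scrapbook")
```

The name stays fixed across the move, so links built on it keep working as long as someone maintains the resolver entry, which makes persistence an organizational commitment as much as a technical one.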
Other Elements • Actors Metadata • Other Metadata • Preserving Electronic Art
http://www.delos-nsf.actorswg.cdlib.org/ DELOS/NSF Working Group Reference Models for Digital Libraries: Actors and Roles
NSF/DELOS Actors/Roles Project • Classes of Actors, including – Persons – Organizations – Automata • Roles & implications – Production – Dissemination – Management – Use
Multimedia & Collaborative Authorship imply • Not only: – Authors – Editors – Publishers • But also creators of – Text – Illustrations – Composers – Musicians ...
And goes beyond conventional authors • Others that are part of digital library process – Users – Catalogers – Reference librarians • Even other groups/entities – Software agents – Mediators – Special rights holders ...
Borbinha’s “naive tentative sketch” of the problem... [diagram labels: Digital Library; Publication; Licensing; Acquisition; Dissemination; Registration; Search; Agent; Creator; Distributor; Editor; Access; User; Librarian; Registered; Anonymous; Preservation]
Benefits for • Linking metadata to authority records • Rights management • Privacy protection
Deliverables • Workshop proceedings: proceedings with invited contributions and papers selected from a call, intended to be a reference source for the current state of the art. • White paper: – Definition and introduction to the problem. – Description and analysis of the requirements. – A proposal to the community for a reference model, focusing on definitions of key concepts, terminology, classes of agents, services, relationships, etc. – Proposals for an international agenda for further technical and collaborative developments.
Core group
DELOS (Europe) • José Borbinha, National Library of Portugal (DELOS coordinator) • Michel Mabe, Elsevier Science, UK (Publishing industry) • Peter Mutschke, Social Science Information Centre, Germany (Software agents, Information Retrieval) • Hans-Jörg Lieder, Berlin State Library, Germany (LEAF project) • Gunnar Karlsen, University of Bergen, Norway (Archives) • WIPO – World Intellectual Property Organisation • Glenn Macstravic
NSF (USA) • John Kunze, University of California, USA (NSF coordinator) • Barbara Tillett, Library of Congress, USA (Libraries) • Becky Dean, OCLC, USA (Libraries services) • Angela Spinazze, CIMI/RLG, USA (Museums) • Howard Besser, University of California, USA (Multimedia and digital art production) • DCMI - Dublin Core Metadata Initiative • Warwick Cathro, National Library of Australia
Work plan Phase 1: Starting (March - April 2002) • Tuning objectives, scope, and action plan • Identification of reference sources • Call for contributions to the workshop Phase 2: Internal Discussion (May - June 2002) • Analysis of the problem • Draft paper Phase 3: Public Discussion (July - October 2002) • Expose the draft paper. Promote open public discussion • Workshop in Portugal (July 3-5). Workshop report • Draft paper (second version) Phase 4: Conclusions (November - December 2002) • Review of the work done... • Final report
... Actors and Roles ???
Data Structures: The VRA Core • 28 elements specifically for visual resource collections • Work Description Categories • Visual Document Description Categories • http://www.oberlin.edu/~art/vra/dsc.html
VRA Core: Work Description Categories • Work type • Title • Measurements • Material • Technique • Creator • Role • Date • Repository name • Repository place • Repository number • Current site • Original site • Style/period/group/movement • Nationality/culture • Subject • Related work • Relationship type • Notes
VRA Core: Visual Document Description Categories • Visual document type • Visual document format • Visual document measurements • Visual document date • Visual document owner number • Visual document view description • Visual document subject • Visual document source
Data Value Metadata (vocabularies) • LCSH • TGM • AAT • ULAN • TGN • VRA Core
LCSH • very general
Thesaurus for Graphic Materials • designed for subject indexing of pictorial materials, particularly large general collections of historical images • for cataloging and retrieval • good for general audiences and broad approaches to the material • TGM-I: Subject Terms & TGM-II: Genre and Physical Characteristic Terms • http://lcweb.loc.gov/rr/print/tgm/toc.html
AAT 120, 000 terms for describing objects, textual materials, images, architecture, and material culture from antiquity to present large and complex http: //www. getty. edu/gri/vocabularies/ Besser--Missouri Digitization 2/18/03 119
ULAN name authority http: //www. getty. edu/gri/vocabularies/ Besser--Missouri Digitization 2/18/03 120
Thesaurus of Geographic Names
• over 1 million records
• hierarchical and global
• throughout history
• most records include coordinates and descriptive notes
Metadata for Digital Commerce: DOI, <indecs>
<indecs>
• formal structure for describing and uniquely identifying intellectual property itself, the people and businesses involved in its trading, and the agreements which they make about it (primarily for publishing, music, and visual arts)
• will develop high-level specifications for the services that will be required to implement a global IP trading system based on this <indecs> generic data model
• focus is on encoding rights at a high level, not on resource discovery
• likely to involve metadata schema registration and a directory to allow interoperation of personal identifiers for rightsholders and users
• supported by EEC DG-13
• First meeting July 1999
• http://www.indecs.org/
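The three kinds of things the slide says <indecs> describes (the intellectual property, the parties who trade it, and the agreements they make) can be sketched as linked records keyed by unique identifiers. This is a loose illustration under stated assumptions: every class name, field name, and identifier below is hypothetical and is not taken from the <indecs> data model itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Creation:
    """A piece of intellectual property (hypothetical fields)."""
    creation_id: str
    title: str


@dataclass(frozen=True)
class Party:
    """A person or business involved in trading rights (hypothetical fields)."""
    party_id: str
    name: str


@dataclass(frozen=True)
class Agreement:
    """A grant of one permitted use of a creation from one party to another."""
    agreement_id: str
    creation_id: str
    grantor_id: str
    grantee_id: str
    permitted_use: str


def uses_granted_to(agreements: List[Agreement], grantee_id: str,
                    creation_id: str) -> List[str]:
    """List, in sorted order, the uses a grantee holds for one creation."""
    return sorted(
        a.permitted_use
        for a in agreements
        if a.grantee_id == grantee_id and a.creation_id == creation_id
    )


# Placeholder records, invented purely for illustration.
image = Creation("creation:0001", "Placeholder image")
artist = Party("party:A", "Placeholder artist")
museum = Party("party:B", "Placeholder museum")
agreements = [
    Agreement("agr:1", image.creation_id, artist.party_id, museum.party_id, "display"),
    Agreement("agr:2", image.creation_id, artist.party_id, museum.party_id, "archive"),
]
print(uses_granted_to(agreements, museum.party_id, image.creation_id))
```

Because each record refers to the others only by identifier, the rights information can be exchanged separately from any discovery metadata, which matches the slide's point that the focus is on rights rather than on resource discovery.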
What’s special about Cultural Heritage Materials? • Images & rich media • Inter-relationships between parts • For Contemporary Art: What is the Work?
LeWitt: Wall Drawing 340
Installing LeWitt
LeWitt Install Directions
Complexity of Rich Media • Works often have artistic nature (including video games) • Enormous number of elements can, at times, be very important to preserve (pacing, original artifact, elements used to construct the artifact) • Too complex to save every one of these aspects for every type of material • Importance of saving documentation
What can we do specific to Electronic Art?
• Works themselves may no longer even exist; in many cases, what we can save amounts to forensic evidence
• Enormous number of elements can, at times, be very important to preserve (pacing, original artifact, elements used to construct the artifact)
• Too complex to save every one of these aspects for every type of material
• Importance of saving pieces, representations, and documentation
• Involve the artists to capture their intentions
• Importance of Standards
• Familiarize ourselves with recent conservation developments (Who Knows?, TechArcheology, Tate, IMAP)
Standards for encoding artists' intentions (group efforts within the Cultural Heritage community)
• Artists Interviews Project, Netherlands Institute for Cultural Heritage 1998-1999, Modern Art: Who Cares (http://www.icn.nl/english/6.4.2.html)
• TechArcheology: A Symposium on Installation Preservation (SFMOMA)
• More recent SFMOMA/Tate collaborations
• IMAP
• Guggenheim’s Variable Media
Structural Metadata Standards for Encoding Multimedia (no time for details) • SMIL • MPEG-4
A few questions our community should address
• Special issues raised by non-library institutions
• Special issues raised by images and rich media
• What is the work (or salient points we need to preserve)?
• Bring the arts communities (artist intent, BAVC) together with the preservation repository communities and the preservation metadata communities
• Specifically get Cultural Heritage communities involved with the selected OCLC/RLG recommendations
• Get cultural heritage groups started on working to make sure that structure standards incorporate our works
• What organizations will take responsibility to save today’s digital “ephemeral” materials (online ‘zines, arts discussion groups, etc.)?
Digital Repository Traditions & Services require Sustainability, Interoperability, and Access. And all of these require Standards and Metadata.
Building a Digital Future: Sustainable, Interoperable, Accessible Repositories
Howard Besser, NYU Archiving & Preservation Program
Bernie Hurley, UC Berkeley Library
• http://www.firstmonday.dk/issues/issue7_6/besser/
• Baca, Murtha (ed). Introduction to Metadata, Los Angeles: Getty Information Institute, 1998 http://www.getty.edu/gri/standard/intrometadata/
• http://www.gseis.ucla.edu/~howard/Metadata/UC-May00/
• http://sunsite.berkeley.edu/Metadata/sp2000.html
• http://sunsite.berkeley.edu/Longevity/
• http://www.oclc.org/digitalpreservation/presmeta_wp.pdf
• http://is.gseis.ucla.edu/us-interpares/
• http://www.niso.org/commitau.html
• http://www.ifla.org/II/metadata.htm
• METS official site: http://www.loc.gov/standards/mets
• UC Libraries Systemwide Operations and Planning Advisory Group (SOPAG) site, http://www.slp.ucop.edu/sopag/, for the UC Digital Preservation & Archiving Committee Final Report, the Access Integration Model white paper, and the Library Services Privacy report