Metadata Applications Marcia Lei Zeng NSDL All Project Meeting October, 2003

Outline
1. Many for one: available metadata standards
2. Different types of metadata and their functions
3. Post-MARC metadata principles
4. Working with your collections
5. Using controlled vocabularies

1. Many for One: Available Metadata Standards
What is the definition of metadata?
• Metadata are structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities.

Existing Metadata Standards (1)
(Check here for updated list and URLs: http://www.slis.kent.edu/~mzeng/metadatalist.htm)
Bibliographic Description (general)
• MARC (Machine Readable Cataloging)
  – MODS (Metadata Object Description Schema)
  – MARC XML
• Dublin Core Element Set
• GILS (Government Information Locator Service)
• RFC 1807 (Format for Bibliographic Records)
• TEI Headers (Text Encoding Initiative)
• MCF (Meta Content Format)
• PICS (Platform for Internet Content Selection)

Existing Metadata Standards (2)
Images and Objects
• Categories for the Description of Works of Art (CDWA)
• VRA (Visual Resources Association) Core Categories
• MESL (Museum Education Site Licensing Project) Data Dictionary
• Object ID
• Guide to the Description of Architectural Drawings (FDA Guide)
• NISO Data Dictionary for Technical Metadata for Digital Still Images
Geospatial Data
• Content Standards for Digital Geospatial Metadata (CSDGM)
Archive
• EAD (Encoded Archival Description) DTD
• Recordkeeping Metadata Standard for Commonwealth Agencies (1999)

Existing Metadata Standards (3)
Rights Management
• Rights Metadata
• DOI -- Digital Object Identifier
Educational-purpose
• Instructional Management Systems (IMS)
• The Gateway to Educational Materials (GEM) Schema
• DC Education Schema (DC-ED)
• IEEE Learning Object Metadata (LOM)
Preservation of digital objects
• CEDARS Project: CEDARS Preservation Metadata Elements
• National Library of Australia. Preservation Metadata for Digital Collections: Exposure Draft
• Networked European Deposit Library. Metadata for Long Term Preservation
Other specialized standards
• Music, media, broadcasting, etc.

Overlapping Metadata Standards
• There is no limit to the type or amount of resources that can be described by metadata.
• There is no limit to the number of overlapping metadata standards for any type of resource or any subject domain.
• There is no limit to the types of profession or subject domain that can be involved in metadata standard development and application.

2. Different Types of Metadata and Their Functions
• Administrative
• Descriptive
• Preservation
• Technical
• Use
Source: Murtha Baca, ed. Introduction to Metadata: Pathways to Digital Information. Getty Information Institute. Table 1.

Administrative Metadata -- Metadata used in managing and administering information resources
- Acquisition information
- Rights and reproduction tracking
- Documentation of legal access requirements
- Location information
- Selection criteria for digitization
- Version control and differentiation between similar information objects
- Audit trails created by recordkeeping systems

Descriptive Metadata -- Metadata used to describe or identify information resources
- Cataloging records
- Finding aids
- Specialized indexes
- Hyperlinked relationships between resources
- Annotations by users
- Metadata for recordkeeping systems generated by records creators

Preservation Metadata -- Metadata related to the preservation management of information resources
- Documentation of physical condition of resources
- Documentation of actions taken to preserve physical and digital versions of resources, e.g., data refreshing and migration

Technical Metadata -- Metadata related to how a system functions or how metadata behave, for example:
– Hardware and software documentation
– Digitization information, e.g., formats, compression ratios, scaling routines
– Tracking of system response times
– Authentication and security data, e.g., encryption keys, passwords

Use Metadata -- Metadata related to the level and type of use of information resources
- Use and user tracking
- Exhibit records
- Content re-use and multi-versioning information

3. Post-MARC Metadata Principles
• Simplicity
• Modularity
• Reusability
• Extensibility
• Interoperability
[Diagram labels: Administrative metadata, Technical metadata, Descriptive metadata, Use metadata, Preservation metadata]

4. Working with your collections -- Knowing the difference
• "Object"/"work" vs. reproduction
• Textual vs. non-textual resources
• Document-like vs. non-document-like objects
• Collection-level vs. item-level

“Credits: Photographs: Various photographers, mostly William Ward Watkin.”
The Construction of the Administration Building
http://www.rice.edu/fondren/woodson/exhibits/Watkin/adminconstruction.html

How to describe …?
• Describe what?
• The image itself? Or
• The building?
• The building as a building? Or
• A building which has a historical importance?

Work vs. Image
• A work is a physical entity that exists, has existed at some time in the past, or that could exist in the future.
• An image is a visual representation of a work. It can exist in photomechanical, photographic and digital formats.

Work vs. Image: an example
• Data sets describing a chair that was documented by a photograph. The photograph was later copied to a slide format and scanned to create a digital image.
• Frederick C. Robie House dining chair
• Designer: Wright, Frank L. (1867-1959)
• See VRA Example 3. http://www.vraweb.org/vracore3.htm#compendium

Work vs. Image
• A digital collection needs to decide what the entity of its collection is:
  – Works, images, or both?
  – How many metadata records are needed for each entity?
• Some part of the data can be reused.
  – E.g., one work has different images or different formats
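The reuse point can be pictured as one work record shared by several image records. The following is a minimal sketch only: the class names, field names, and identifiers are hypothetical and are not taken from VRA Core; the chair and its three reproductions come from the example slide above.

```python
from dataclasses import dataclass


@dataclass
class WorkRecord:
    """Describes the work itself (hypothetical field names)."""
    work_id: str
    title: str
    creator: str


@dataclass
class ImageRecord:
    """Describes one visual representation of a work."""
    image_id: str
    work: WorkRecord      # shared reference: the work data is entered once
    image_format: str     # e.g. "photograph", "slide", "digital image"


# One work, three successive reproductions (identifiers are made up).
chair = WorkRecord("w1", "Frederick C. Robie House dining chair",
                   "Wright, Frank L. (1867-1959)")
images = [
    ImageRecord("i1", chair, "photograph"),
    ImageRecord("i2", chair, "slide"),
    ImageRecord("i3", chair, "digital image"),
]

for img in images:
    print(img.image_id, img.image_format, "->", img.work.title)
```

Here the collection holds four records (one work, three images), and a correction to the work record is visible from every image record.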

Revisiting Dublin Core

Content        Intellectual Property   Instantiation
Coverage       Contributor             Date
Description    Creator                 Format
Type           Publisher               Identifier
Relation       Rights                  Language
Source
Subject
Title

If one work has different reproductions …
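As a rough illustration of the reproduction question, two descriptions of the same work can share most elements and differ mainly in the Instantiation group. The element names and the three groups are those on the slide; the values, identifiers, and the `by_group` helper are invented for this sketch.

```python
# Grouping of the 15 Dublin Core elements as shown on the slide.
GROUPS = {
    "Content": ["Coverage", "Description", "Type", "Relation",
                "Source", "Subject", "Title"],
    "Intellectual Property": ["Contributor", "Creator", "Publisher", "Rights"],
    "Instantiation": ["Date", "Format", "Identifier", "Language"],
}


def by_group(record):
    """Split a flat {element: value} record into the three groups."""
    return {
        group: {el: record[el] for el in elements if el in record}
        for group, elements in GROUPS.items()
    }


# Hypothetical values: one work, two reproductions.
shared = {"Title": "Dining chair", "Creator": "Wright, Frank L.", "Type": "Image"}
slide_copy = {**shared, "Format": "35 mm slide", "Identifier": "example:slide-001"}
digital_copy = {**shared, "Format": "image/jpeg",
                "Identifier": "example:scan-001",
                "Source": "example:slide-001"}  # the scan was made from the slide

print(by_group(slide_copy))
print(by_group(digital_copy))
```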

Textual vs. Non-textual
• Text:
  – Would allow for full text searching or automatic extraction of keywords.
  – Marked by HTML or XML tags.
  – Tags have semantic meanings.
• Non-textual, e.g., images:
  – Only the captions and file names can be searched, not the image itself.
  – Need transcribing or interpreting.
  – Need more detailed metadata to describe its contents.
  – Need knowledge to give a deeper interpretation.
[Image: Newspaper dated July 16, 1976, reporting the initial discovery of burials in Granado Cave.]

Document-like vs. non-document-like
Each object usually has the following characteristics:
• being in three dimensions,
• having multiple components,
• carrying information about history, culture, and society, and
• demonstrating details of style, pattern, material, color, technique, etc.

Collection-level vs. item-level
• Collection level
• Item level
• Relation
  – Is Version Of / Has Version
  – Is Replaced By / Replaces
  – Is Required By / Requires
  – Is Part Of / Has Part
  – Is Referenced By / References
  – Is Format Of / Has Format
  – Conforms To
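The Relation refinements listed here come in reciprocal pairs, which is what lets an item-level record point to its collection and the collection-level record point back. A minimal sketch, assuming a simple in-memory store; the record identifiers and the `relate` helper are hypothetical.

```python
from collections import defaultdict

# Reciprocal pairs among the Relation refinements listed above.
PAIRS = [
    ("Is Version Of", "Has Version"),
    ("Is Replaced By", "Replaces"),
    ("Is Required By", "Requires"),
    ("Is Part Of", "Has Part"),
    ("Is Referenced By", "References"),
    ("Is Format Of", "Has Format"),
]
INVERSE = {}
for a, b in PAIRS:
    INVERSE[a] = b
    INVERSE[b] = a

# record id -> list of (relation, target id)
relations = defaultdict(list)


def relate(source, relation, target):
    """Record a relation and, where one exists, its reciprocal."""
    relations[source].append((relation, target))
    inverse = INVERSE.get(relation)  # "Conforms To" has no reciprocal here
    if inverse:
        relations[target].append((inverse, source))


relate("item-1", "Is Part Of", "collection-A")
relate("item-2", "Is Part Of", "collection-A")
print(relations["collection-A"])
# [('Has Part', 'item-1'), ('Has Part', 'item-2')]
```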

Collection example
Dorothea Lange's "Migrant Mother" Photographs in the Farm Security Administration Collection
http://www.loc.gov/rr/print/128_migm.html (next slide)

5. Using Controlled Vocabularies in Metadata Records
(Check here for the updated list: http://www.slis.kent.edu/~mzeng/metadata/thesaurilist.htm)
• Content data for some elements may be selected from a controlled vocabulary:
  1. Established vocabularies
     • Controlled Vocabularies and Classification Schemes
     • Standardized vocabularies
  2. Name authority files
  3. Controlled terms

Revisiting Dublin Core

Content        Intellectual Property   Instantiation
Coverage       Contributor             Date
Description    Creator                 Format
Type           Publisher               Identifier
Relation       Rights                  Language
Source
Subject
Title

Content data for some elements may be selected from a controlled vocabulary …

Established Controlled Vocabularies and Classification Schemes
Usually recommended by the metadata best practice guidelines:
• Subject Headings
  – LC Subject Headings (LCSH)
  – Medical Subject Headings (MeSH)
• Thesauri
  – Art and Architecture Thesaurus (AAT)
• Classification schemes
  – Dewey Decimal Classification (DDC)

Standardized vocabularies
– Type: DCMI Type Vocabulary
– Format: Internet Media Types [MIME]
– Language: RFC 3066 [RFC 3066] in conjunction with ISO 639, Codes for the representation of names of languages
– Countries: ISO 3166, Codes for the representation of names of countries
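Standardized vocabularies lend themselves to automatic checking at input time. The sketch below is one possible approach, not a prescribed one: the `check` function and sample records are hypothetical, the Type list is the DCMI Type Vocabulary as currently published, and the Format and Language tests only check the shape of the value (a `type/subtype` string, an RFC 3066 style tag) without consulting the IANA or ISO registries.

```python
import re

# Terms of the DCMI Type Vocabulary.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}
# Shape checks only; neither pattern looks anything up in a registry.
MEDIA_TYPE = re.compile(r"^[a-z]+/[A-Za-z0-9][A-Za-z0-9.+-]*$")
LANG_TAG = re.compile(r"^[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*$")


def check(record):
    """Return a list of problems found in a flat {element: value} record."""
    problems = []
    if "Type" in record and record["Type"] not in DCMI_TYPES:
        problems.append(f"Type not in DCMI Type Vocabulary: {record['Type']}")
    if "Format" in record and not MEDIA_TYPE.match(record["Format"]):
        problems.append(f"Format is not a media type: {record['Format']}")
    if "Language" in record and not LANG_TAG.match(record["Language"]):
        problems.append(f"Language is not a language tag: {record['Language']}")
    return problems


good = {"Type": "Image", "Format": "image/jpeg", "Language": "en-US"}
bad = {"Type": "Photo", "Format": "JPEG", "Language": "English (US)"}
print(check(good))
print(check(bad))
```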

Name authority control
– The Union List of Artist Names (ULAN), Getty
– Thesaurus of Geographic Names (TGN), Getty
– LC Name Authority file = Anglo-American Authority File (AAAF), and
– local name authority files
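A local name authority file can be as simple as a table that maps variant forms of a name to one authorized heading. The following sketch assumes that structure; the headings and variants are invented for illustration and are not copied from ULAN or the LC Name Authority file, and `normalize` and `authorized` are hypothetical helpers.

```python
# Authorized heading -> known variant forms (illustrative data only).
AUTHORITY = {
    "Wright, Frank Lloyd": ["Frank Lloyd Wright", "Wright, Frank L.", "F. L. Wright"],
    "Watkin, William Ward": ["William Ward Watkin", "W. W. Watkin"],
}


def normalize(name):
    """Lowercase, drop commas and periods, and collapse whitespace."""
    cleaned = name.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


# Lookup index from every normalized form to its authorized heading.
INDEX = {}
for heading, variants in AUTHORITY.items():
    for form in [heading, *variants]:
        INDEX[normalize(form)] = heading


def authorized(name):
    """Return the authorized heading for a name, or None if unknown."""
    return INDEX.get(normalize(name))


print(authorized("frank lloyd wright"))
print(authorized("WRIGHT, FRANK L"))
print(authorized("Lange, Dorothea"))
```

A name that returns `None` is a candidate for review and possible addition to the local file, rather than something to be entered as free text.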

Controlled terms
• Dictionaries and indexes
  – Compile a list of suggested terms based on dictionaries and indexes
• “Folk” controlled lists
  • DC-ED: audience, pedagogy
  • GEM Controlled Vocabularies:
    – Audience | Format | Grade | Language | Pedagogy | Relation | Resource Type | Subject
    – http://www.geminfo.org/Workbench_vocabularies.html

Putting things together
• Internal work:
  – Standards, including formats and vocabularies
  – A metadata input tool
  – Storage: text files and databases
• User interface:
  – Browsing and searching interfaces
    • The materials as well as the surrogates
    • How to organize all the materials (by type, date, subject …)
  – Integrating with different systems
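To make the storage and interface items concrete, here is a small sketch that keeps surrogate records in a database and offers one browsing and one searching function. It assumes SQLite through Python's standard `sqlite3` module; the table layout, column names, and sample rows are hypothetical.

```python
import sqlite3

# In-memory database standing in for the collection's metadata store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE record ("
    "identifier TEXT PRIMARY KEY, title TEXT, "
    "dc_type TEXT, dc_date TEXT, dc_subject TEXT)"
)
rows = [
    ("example:1", "Dining chair, front view", "Image", "2003", "Furniture"),
    ("example:2", "Dining chair, side view", "Image", "2003", "Furniture"),
    ("example:3", "Newspaper clipping", "Text", "1976", "Archaeology"),
]
conn.executemany("INSERT INTO record VALUES (?, ?, ?, ?, ?)", rows)

BROWSABLE = {"dc_type", "dc_date", "dc_subject"}


def browse(column):
    """Browsing: count records under each value of one element."""
    if column not in BROWSABLE:  # whitelist, since the name goes into the SQL
        raise ValueError(f"cannot browse by {column}")
    sql = (f"SELECT {column}, COUNT(*) FROM record "
           f"GROUP BY {column} ORDER BY {column}")
    return conn.execute(sql).fetchall()


def search(term):
    """Searching: titles containing the term."""
    sql = ("SELECT identifier, title FROM record "
           "WHERE title LIKE ? ORDER BY identifier")
    return conn.execute(sql, (f"%{term}%",)).fetchall()


print(browse("dc_type"))
print(search("chair"))
```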