Stanford Digital Repository
PREMIS & Geospatial Resources
Nancy J. Hoebelheinrich
Info. Analytics, San Mateo, CA
ESIP 2009 Summer Meeting, UC Santa Barbara, CA, July 7 – 10, 2009

To Be Discussed
• A Brief History of PREMIS
• An Overview of PREMIS data elements
• Uses for Geospatial Resources: Examples

A Brief History of PREMIS
• PREMIS – Preservation Metadata came initially from cultural heritage / digital preservation communities
• Built upon previous initiative (2001–02)
  - Sponsored by two key library descriptive MD utilities (OCLC and RLG)
  - Preservation Metadata Framework working group
  - Issued a report outlining types of information that should be associated with an archived digital object

A Brief History of PREMIS
• In 2003 a PREMIS working group formed
• Composed of practitioners building or working on preservation repositories, including national data centers in the UK, US, Netherlands, etc.
• Focused upon implementable data elements
• Resulted in a two-pronged effort:
  - Implementation survey
  - Data dictionary of CORE preservation semantic units (= data elements)

A Brief History of PREMIS
• PREMIS working group publications:
  - “Implementing Preservation Repositories for Digital Materials: Current Practice and Emerging Trends in the Cultural Heritage Community”, December 2004
  - “PREMIS Data Dictionary for Preservation Metadata, version 1.0”, May 2005

A Brief History of PREMIS
• PREMIS Implementation
  - PREMIS Editorial Committee formed
  - Maintained by Library of Congress
  - “PREMIS Data Dictionary for Preservation Metadata, version 2.0”, March 2008
  - Who uses it? See the implementation registry
  - PREMIS Implementors Group (PIG) listserv for practitioners

PREMIS Data Model for an “intellectual entity”
• OBJECT: Discrete unit of information in digital form
• RIGHTS: Rights or permissions info associated with Object or Agent
• EVENTS: Important lifecycle events
• AGENTS: Parties to Events and/or Rights

PREMIS Data Model

More about PREMIS Object
• Is an abstraction, meant to cluster semantic units and clarify relationships
• Has 3 subtypes:
  - File – the usual suspect
  - Bitstream – contiguous or non-contiguous data within a file that has meaningful common properties for preservation purposes
  - Representation – set of files, including structural metadata, needed for a complete and reasonable rendition of an Intellectual Entity
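
The four entities and the three Object subtypes can be pictured as plain data structures. The following is a minimal sketch for illustration only, not the PREMIS schema: the class names, field names, and identifiers such as "obj-1" are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ObjectCategory(Enum):
    """The three Object subtypes named on the slide."""
    FILE = "file"
    BITSTREAM = "bitstream"
    REPRESENTATION = "representation"


@dataclass
class Agent:
    identifier: str
    name: str


@dataclass
class Event:
    identifier: str
    event_type: str                              # e.g. "ingestion" or "transform"
    agent_ids: List[str] = field(default_factory=list)


@dataclass
class RightsStatement:
    identifier: str
    basis: str                                   # hypothetical label, e.g. "deposit agreement"


@dataclass
class PremisObject:
    identifier: str
    category: ObjectCategory
    event_ids: List[str] = field(default_factory=list)
    rights_ids: List[str] = field(default_factory=list)


# A representation and one of its files, both touched by the same event.
ingest_code = Agent("agent-1", "Ingest code")
ingest = Event("event-1", "ingestion", agent_ids=[ingest_code.identifier])
rep = PremisObject("obj-1", ObjectCategory.REPRESENTATION, event_ids=[ingest.identifier])
street_file = PremisObject("obj-2", ObjectCategory.FILE, event_ids=[ingest.identifier])
```

Entities refer to each other by identifier rather than by nesting, which mirrors the way the data model links Objects, Events, Agents, and Rights.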

Assumptions underlying PREMIS
• Not about “descriptive” metadata (used for search & discovery)
• Not about “technical” metadata (usually about the format(s) of the component files or bitstreams)
• These areas are to be covered by domain-specific metadata, e.g., FGDC or ISO profiles
Mind the Gap!

Simple Example of use of PREMIS Object Data Elements
• Applied at file level
• Automatic insertion by Ingest code to retain important provenance info for each file before moving into the preservation repository
  - Original file name from data provider
  - Original checksum
  - Original file size
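
A rough sketch of what such an ingest step might look like, assuming Python and the PREMIS 2.x element names (originalName, fixity, size). This is not the repository's actual Ingest code: the function name describe_file, the "local" identifier type, and the choice of SHA-1 are assumptions, and the output omits required parts such as format information, so it is not meant to validate against the schema.

```python
import hashlib
import os
import xml.etree.ElementTree as ET

PREMIS_NS = "info:lc/xmlns/premis-v2"   # PREMIS 2.x namespace; v1.1 uses a different one
ET.register_namespace("premis", PREMIS_NS)


def q(tag):
    """Qualify a tag name with the PREMIS namespace."""
    return f"{{{PREMIS_NS}}}{tag}"


def describe_file(path, local_id):
    """Record original name, checksum, and size for one file (hypothetical helper)."""
    # Reads the whole file at once; a real ingest would hash in chunks.
    with open(path, "rb") as fh:
        digest = hashlib.sha1(fh.read()).hexdigest()

    obj = ET.Element(q("object"))
    ident = ET.SubElement(obj, q("objectIdentifier"))
    ET.SubElement(ident, q("objectIdentifierType")).text = "local"
    ET.SubElement(ident, q("objectIdentifierValue")).text = local_id

    chars = ET.SubElement(obj, q("objectCharacteristics"))
    ET.SubElement(chars, q("compositionLevel")).text = "0"
    fixity = ET.SubElement(chars, q("fixity"))
    ET.SubElement(fixity, q("messageDigestAlgorithm")).text = "SHA-1"
    ET.SubElement(fixity, q("messageDigest")).text = digest
    ET.SubElement(chars, q("size")).text = str(os.path.getsize(path))

    # File name as supplied by the data provider, before any renaming on ingest.
    ET.SubElement(obj, q("originalName")).text = os.path.basename(path)
    return obj
```

Calling `ET.tostring(describe_file(path, "obj-1"), encoding="unicode")` returns the element as an XML string that could be embedded in a packaging document.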

PREMIS Object Excerpt (v1.1)

More about PREMIS Object relationships
• Defined as associations between two or more:
  - Object entities, or
  - Entities of different types, e.g., an Object & an Agent
• Recorded for long-term preservation purposes
• Typical relationship types = structural (component of representation), derivative (format varieties), dependent (required schema or database structure)
• Could be expressed using other schemas for packaging the resource, such as METS, XFDU, or MPEG DIDL
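
As an illustration of how a structural and a derivative relationship might be recorded on a file-level Object, here is a minimal sketch using the PREMIS 2.x relationship elements. The helper name add_relationship, the identifiers, and the type and subtype strings are assumptions chosen for the example; a repository would use its own controlled vocabulary.

```python
import xml.etree.ElementTree as ET

PREMIS_NS = "info:lc/xmlns/premis-v2"   # PREMIS 2.x namespace
ET.register_namespace("premis", PREMIS_NS)


def q(tag):
    """Qualify a tag name with the PREMIS namespace."""
    return f"{{{PREMIS_NS}}}{tag}"


def add_relationship(obj, rel_type, rel_subtype, related_id):
    """Append one relationship to another Object (hypothetical helper)."""
    rel = ET.SubElement(obj, q("relationship"))
    ET.SubElement(rel, q("relationshipType")).text = rel_type
    ET.SubElement(rel, q("relationshipSubType")).text = rel_subtype
    related = ET.SubElement(rel, q("relatedObjectIdentification"))
    ET.SubElement(related, q("relatedObjectIdentifierType")).text = "local"
    ET.SubElement(related, q("relatedObjectIdentifierValue")).text = related_id
    return rel


file_obj = ET.Element(q("object"))
# Structural: this file is a component of a representation.
add_relationship(file_obj, "structural", "is part of", "representation-1")
# Derivative: another format variety was produced from this file.
add_relationship(file_obj, "derivation", "is source of", "file-2")
```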

Use of PREMIS Rights data elements
• Applied at representation level
• Reference to donor’s Deposit Agreement (using METS)
• Key info from the ingested Deposit Agreement for immediate playback

PREMIS Rights Excerpt (v1.1)

Use of PREMIS Event for simple event
• Event 1:
  - Transform of descriptive MD from MS Access db => XML => MODS
  - Applied at representation level
• Why this event?
  - In case of questions from outside data provider
  - Retain singular scripts & transform mechanisms
  - Test practicability of recording such events in production environment
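
A transform event like this one could be recorded roughly as follows. This is a sketch using PREMIS 2.x event element names, not the excerpt shown in the talk: the function build_event, all identifiers, and the timestamp are placeholders invented for the example.

```python
import xml.etree.ElementTree as ET

PREMIS_NS = "info:lc/xmlns/premis-v2"   # PREMIS 2.x namespace
ET.register_namespace("premis", PREMIS_NS)


def q(tag):
    """Qualify a tag name with the PREMIS namespace."""
    return f"{{{PREMIS_NS}}}{tag}"


def build_event(event_id, event_type, when, detail, agent_id, object_id):
    """Build one event linked to the agent that ran it and the object it affected."""
    event = ET.Element(q("event"))
    ident = ET.SubElement(event, q("eventIdentifier"))
    ET.SubElement(ident, q("eventIdentifierType")).text = "local"
    ET.SubElement(ident, q("eventIdentifierValue")).text = event_id
    ET.SubElement(event, q("eventType")).text = event_type
    ET.SubElement(event, q("eventDateTime")).text = when
    ET.SubElement(event, q("eventDetail")).text = detail

    agent = ET.SubElement(event, q("linkingAgentIdentifier"))
    ET.SubElement(agent, q("linkingAgentIdentifierType")).text = "local"
    ET.SubElement(agent, q("linkingAgentIdentifierValue")).text = agent_id

    obj = ET.SubElement(event, q("linkingObjectIdentifier"))
    ET.SubElement(obj, q("linkingObjectIdentifierType")).text = "local"
    ET.SubElement(obj, q("linkingObjectIdentifierValue")).text = object_id
    return event


event = build_event(
    event_id="event-1",                   # placeholder identifier
    event_type="transform",
    when="2009-01-01T00:00:00",           # placeholder timestamp, not a real date
    detail="Descriptive MD: MS Access db => XML => MODS",
    agent_id="transform-script-1",        # hypothetical agent for the script used
    object_id="representation-1",         # event applied at representation level
)
xml_text = ET.tostring(event, encoding="unicode")
```

Linking the event to an agent for the transform script is one way to retain a pointer to the one-off scripts and mechanisms mentioned above.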

PREMIS Event Excerpt (v1.1)

Another example: GIS
• Dataset: Street network of given metropolitan area
• Dataset 1: official street centerline file used by emergency services to locate street addresses
• Dataset 2: aspects of the road network including topography, angles & geometry of the road network used for a tourist map
• Event to be documented: Merge c: tempstates 1; c: temp states 2; c: tempUSA

Use of PREMIS Event Data Elements
• Why this event?
  - Important to describe processes during different phases of lifecycle, even prior to ingestion
  - Want to describe full process of data creation
  - Includes “merge” and data sources
• Advantage of PREMIS – can describe events once in repository

Use of PREMIS Agent Data Elements
• For data management within the repository
  - Version of Ingest code?
• Audit trail for descriptive MD
  - Data provider who created / altered the resource or the metadata, e.g., USGS which added FGDC MD to HRO from Monterey Bay Water Resource

PREMIS & Geospatial data – Comments based on experiences:
• Works well when:
  - Domain-specific MD exists, e.g., FGDC for descriptive and technical MD
  - There are levels of the resource with MD to be associated, e.g., at representation & file(s) level
  - Need to document various points in the lifecycle of the data

PREMIS & Geospatial data – Comments based on experiences:
• In earlier versions of PREMIS unclear how to document:
  - Context
  - Environment, including at time of creation
  - “Significant properties”
  - Existence of geospatial format registries

PREMIS v2.0 more flexible
• Still XML binding
  - Allows for containers
  - Allows hierarchical relationships
  - Extensible by use of new <premis:extension> element to insert other elements or XML fragments, e.g., technical MD, provenance metadata, etc.
• Board considering the inclusion of mechanism used by packaging schemas to “wrap” or “reference” other metadata
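
To make the extension idea concrete, the sketch below wraps a small FGDC-style fragment inside a PREMIS extension container. It assumes the objectCharacteristicsExtension container from the PREMIS 2.x schema as one possible home for domain-specific technical MD. The helper wrap_in_extension and the fragment's content are invented for the example, and which container suits a given kind of geospatial metadata is a local decision.

```python
import xml.etree.ElementTree as ET

PREMIS_NS = "info:lc/xmlns/premis-v2"   # PREMIS 2.x namespace
ET.register_namespace("premis", PREMIS_NS)


def q(tag):
    """Qualify a tag name with the PREMIS namespace."""
    return f"{{{PREMIS_NS}}}{tag}"


def wrap_in_extension(characteristics, fragment):
    """Place a non-PREMIS XML fragment inside an extension container."""
    ext = ET.SubElement(characteristics, q("objectCharacteristicsExtension"))
    ext.append(fragment)
    return ext


# Hypothetical FGDC-style fragment; the purpose text is illustrative only.
fgdc_fragment = ET.fromstring(
    "<metadata><idinfo><descript>"
    "<purpose>Street centerlines for locating addresses</purpose>"
    "</descript></idinfo></metadata>"
)

chars = ET.Element(q("objectCharacteristics"))
ext = wrap_in_extension(chars, fgdc_fragment)
xml_text = ET.tostring(chars, encoding="unicode")
```

The fragment keeps its own element names inside the container, so the domain metadata travels with the PREMIS record without being remapped.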

PREMIS & Complex Geospatial Data
• For more detail, see “An Investigation into Archiving Geospatial data Formats” prepared for NGDA Project, funded by NDIIPP (http://www.ngda.org/research.php)
  - Formats examined
  - Approaches of FGDC, PREMIS, and Center for International Earth Science Information Network (CIESIN)’s Geospatial Electronic Record (GER) model on basis of:
    • Environment / computer platform
    • Semantic underpinnings
    • Domain-specific terminology
    • Provenance
    • Data quality
    • Appropriate use

Examples of Geospatial “Context”
• Placing dataset in Time & Space
• Semantic underpinnings, e.g.,
  - Abstract
  - Description of purpose / research methodology
  - Intended use of data to avoid misinterpretation or misuse
• Where to put?
  - FGDC has place
  - PREMIS would not necessarily consider this as “preservation” metadata, but rather “descriptive” or technical MD; however, see v2.0

Examples of “Environment” and/or “Significant properties” for geospatial data
• HW info pertinent at time of data creation
• SW info pertinent at time of data creation (?)
• Lineage or “provenance” data, e.g., to communicate processing steps used to create scientific data product
• Events, parameters & source data which influenced or impacted the creation of the data set prior to its ingestion into the archive, in order to fully understand the data that you’re getting

“Environment” & “Significant properties”, continued…
• Data Quality – describing completeness, logical consistency, attribute accuracy
• Data Trustworthiness – data creator / provider reliable? = “authentic”
• Data Provenance – processes & sources for dataset = “understandable & reliable”
• Understanding of the specific needs of the “designated community”

Questions? / comments?
Nancy J. Hoebelheinrich
njhoebel@gmail.com