Stanford Digital Repository

Extending the Implementation of PREMIS to Geospatial Resources in the Stanford Digital Repository: An Exploration

By Nancy J. Hoebelheinrich
Metadata Coordinator, Digital Library Systems & Services
PREMIS Tutorial, San Diego, CA, 11 Feb 2008

To Be Discussed
  • Context for SDR (Stanford Digital Repository)
  • What PREMIS data elements are being used currently
      - How & why
  • PREMIS & Geospatial Resources - a fit?

Implementing the PREMIS Data Dictionary at Stanford

SDR Context
  • Bit-level preservation environment
  • Designed to facilitate an automated production environment for digitization & receipt of digital materials
  • Use of a “target manifest” (TM)
      - = core metadata, structure & file inventory expressed as METS document
      - = SIP w/o content files

Scenario 1: David Rumsey Historical Maps Collection
  • Comprised of historical maps digitized as single, still-image TIFFs
  • METS records for:
      - Rumsey Deposit Agreement
      - Rumsey “Collection Level” & Auxiliary Files
      - Each Item

METS Documents for Rumsey Collection
[Diagram: Relationships among METS Docs. Nodes: Deposit Agreement; Individual Map; Collection Level / Auxiliary Files]

PREMIS Records contained w/in METS Documents
  • PREMIS OBJECT: Aspects of digital provenance
  • PREMIS RIGHTS: Succinct link to full rights statement
  • PREMIS EVENTS: Important lifecycle events

Use of PREMIS Object Data Elements
  • Used in each METS Document referencing files
      - Item, Agreement, “Collection Level” & Auxiliary Files
  • Located in the METS <amdSec><techMD> section
  • Automatic insertion by Ingest code to retain important provenance info for each file:
      - Original file name from data provider
      - Original checksum
      - Original file size
  • Some information redundant, but prefer to retain in case METS sections need to be pulled out separately for action
      - Rumsey Item TM; Rumsey PREMIS_Object excerpt
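The ingest step described above (recording each file's original name, checksum, and size as a PREMIS object inside METS <amdSec><techMD>) might look roughly like the following sketch. It is not the SDR ingest code. The PREMIS namespace/version, the `PREMIS:OBJECT` MDTYPE value, the SHA-256 algorithm, the function name, and the sample file name are all assumptions made for illustration, and the output is deliberately not schema-complete (a valid PREMIS object needs an identifier and other elements).

```python
import hashlib
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "info:lc/xmlns/premis-v2"  # assumed version; the slides do not name one


def premis_object_techmd(original_name, content, md_id="TECH1"):
    """Wrap a minimal PREMIS object (name, checksum, size) in a METS techMD."""
    tech_md = ET.Element(f"{{{METS_NS}}}techMD", {"ID": md_id})
    md_wrap = ET.SubElement(
        tech_md, f"{{{METS_NS}}}mdWrap", {"MDTYPE": "PREMIS:OBJECT"}
    )
    xml_data = ET.SubElement(md_wrap, f"{{{METS_NS}}}xmlData")
    obj = ET.SubElement(xml_data, f"{{{PREMIS_NS}}}object")

    # Fixity and size are computed from the bytes as received from the provider.
    chars = ET.SubElement(obj, f"{{{PREMIS_NS}}}objectCharacteristics")
    fixity = ET.SubElement(chars, f"{{{PREMIS_NS}}}fixity")
    # SHA-256 is an illustrative choice; the slides do not say which algorithm is used.
    ET.SubElement(fixity, f"{{{PREMIS_NS}}}messageDigestAlgorithm").text = "SHA-256"
    ET.SubElement(fixity, f"{{{PREMIS_NS}}}messageDigest").text = (
        hashlib.sha256(content).hexdigest()
    )
    ET.SubElement(chars, f"{{{PREMIS_NS}}}size").text = str(len(content))

    # File name exactly as supplied by the data provider.
    ET.SubElement(obj, f"{{{PREMIS_NS}}}originalName").text = original_name
    return tech_md


# Hypothetical file name and content, for illustration only.
example = premis_object_techmd("map_0001.tif", b"example bytes")
print(ET.tostring(example, encoding="unicode"))
```

Because the techMD element is self-contained, it can be pulled out of the METS document on its own, which is the reason the slide gives for tolerating the redundancy.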

Use of PREMIS Rights data elements
  • Rumsey Deposit Agreement TM
  • Represents the ingested draft Agreement with its own TM
  • Placeholder for:
      - XML or other REL instance of full agreement, or
      - Use of METSRights once final agreement template is vetted & agreed upon by University Counsel

Use of PREMIS Rights data elements
  • How?
      - <amdSec><rightsMD><mdWrap><xmlData>
      - Agreement TM; Rumsey Rights Excerpt
  • Why?
      - Succinct summary of key information for quick access from METS Document itself
      - Locator for more complete expression of terms, conditions

Use of PREMIS Event Data Elements
  • Event 1:
      - Transform of descriptive MD from MS Access db => XML => MODS
      - Inserted into METS <amdSec><digiprovMD>
      - Rumsey SimpleFile TM; Rumsey Event Excerpt
  • Why this event?
      - In case of questions from outside data provider
      - Retain singular scripts & transform mechanisms
      - Test practicability of recording such events in production environment
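A transform event of this kind, placed in METS <amdSec><digiprovMD>, could be recorded along the lines of the sketch below. It is only an illustration: the namespace/version, the `PREMIS:EVENT` MDTYPE value, the event type string, the date, and the function name are assumptions rather than values from the Rumsey records, and required PREMIS elements such as the event identifier are omitted for brevity.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "info:lc/xmlns/premis-v2"  # assumed version; the slides do not name one


def premis_event_digiprov(event_type, detail, when, md_id="DP1"):
    """Wrap a minimal PREMIS event in a METS digiprovMD element."""
    digiprov = ET.Element(f"{{{METS_NS}}}digiprovMD", {"ID": md_id})
    md_wrap = ET.SubElement(
        digiprov, f"{{{METS_NS}}}mdWrap", {"MDTYPE": "PREMIS:EVENT"}
    )
    xml_data = ET.SubElement(md_wrap, f"{{{METS_NS}}}xmlData")
    event = ET.SubElement(xml_data, f"{{{PREMIS_NS}}}event")

    ET.SubElement(event, f"{{{PREMIS_NS}}}eventType").text = event_type
    ET.SubElement(event, f"{{{PREMIS_NS}}}eventDateTime").text = when.isoformat()
    # Free-text description of the transform chain, kept for later questions.
    ET.SubElement(event, f"{{{PREMIS_NS}}}eventDetail").text = detail
    return digiprov


# Hypothetical event type and date, for illustration only.
example_event = premis_event_digiprov(
    "metadata transformation",
    "Descriptive MD transformed: MS Access db => XML => MODS",
    datetime(2008, 1, 15, tzinfo=timezone.utc),
)
print(ET.tostring(example_event, encoding="unicode"))
```

The eventDetail text is where a pointer to the one-off scripts or transform mechanisms mentioned on the slide could be kept.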

Scenario 2: Geospatial Files & PREMIS – is it a fit?
  • Shapefiles
  • Digital Raster Graphics (DRG) files
  • Digital Ortho Quarter Quads (DOQQs)
  • Factors:
      - Existence of extant domain-specific MD, e.g., FGDC for descriptive and technical MD
      - Number of layers of the resource, e.g., representation & file?
      - Point in resource lifecycle wishing to document

Use of PREMIS Object Data Elements – Scenario 2: GIS Dataset
  • Domain-specific needs for Object:
      - Context, especially for semantic underpinnings, e.g., abstract, description of purpose, intended use of data
          * No place for this in PREMIS(?)
          * Perhaps <object><relationship><relatedObjectIdentification> for an explanatory website?
      - Environment
          * HW/SW info pertinent at time of data creation(?)
      - “Significant properties”
          * Data Quality – describing completeness, logical consistency, attribute accuracy
          * Data Trustworthiness – data creator / provider reliable? = “authentic”
          * Data Provenance – processes & sources for dataset = “understandable”
  • Better understanding of what’s contained in a “format registry” - & their existence!

Use of PREMIS Event Data Elements
  • Event:
      - Would prefer the option to describe process of data creation
      - Merge c:\temp\states1; c:\temp\states2; c:\temp\USA
      - (includes process = “merge” and data sources)
  • Why this event?
      - Important to describe processes during different phases of lifecycle, even prior to ingestion
      - Not to be able to do so – problematic for geospatial resources
      - Advantage – can describe events once in repository, unlike FGDC
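To make the "process plus data sources" idea concrete, the sketch below splits a process-step string like the merge example into a process name and the datasets it lists, which is the information a data-creation event would need to carry. The `CreationEvent` class, the parser, and the Windows-style path separators are hypothetical; the slides do not define a format for such strings or say which datasets are inputs and which are outputs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CreationEvent:
    """Hypothetical record of a pre-ingest data-creation step."""
    process: str
    datasets: List[str] = field(default_factory=list)


def parse_process_step(step: str) -> CreationEvent:
    """Split '<Process> <dataset>; <dataset>; ...' into its parts."""
    verb, _, rest = step.strip().partition(" ")
    # Datasets are separated by semicolons; surrounding whitespace is dropped.
    datasets = [part.strip() for part in rest.split(";") if part.strip()]
    return CreationEvent(process=verb.lower(), datasets=datasets)


# Path separators are assumed; roles of the listed datasets are not inferred.
creation = parse_process_step(r"Merge c:\temp\states1; c:\temp\states2; c:\temp\USA")
print(creation.process, creation.datasets)
```

A structure like this could then be mapped onto an event's type and linked objects, but that mapping is left open here, as it is on the slide.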

Issues & Challenges
  • Getting domain-specific MD would help!
  • If not, getting important prez info from data creators -- uh huh!!
  • How to determine what is truly necessary for dataset use?
  • Is this level of documentation still bit preservation?
  • Getting buy-in from domains, e.g., geospatial

Questions? / comments?
Nancy J. Hoebelheinrich
nhoebel@stanford.edu