Digitizing Historical Records and Archives

Digitizing Historical Records and Archives National Archives Conference for Fraternities and Sororities July 17, 2010 Urbana, Illinois Christopher J. Prom, Ph.D. Assistant University Archivist and Associate Professor University of Illinois at Urbana-Champaign prom@illinois.edu

Session Messages • Don’t get hung up on perfection • Know where to look • Use standards wisely

Session Overview • Project Planning, Funding and Promotion • Description, Metadata and Access • Equipment and Software • Workflow and Techniques • Your Questions: Throughout

PROVISOS • NOT Covered – Digital or Analog Preservation – Detailed Workflows, etc. • Based on: – Past and current projects I’ve managed – Other resources • Importance of discussion and best practices/resources

1. Project Management • Basic Steps (next 25 minutes): • Know your limits; start small – Determine/negotiate purpose and scope – Develop the plan – Secure Funding (internal vs external) • Rest of Morning – Determine conversion, description and access mechanisms – Execute the plan • Outsource/work with vendors? • Complete work internally

Staying within limits • Self Assessment/Readiness – Staffing, survey of skills (volunteers?) – Funding: are internal resources available? – Hardware, servers and storage (online, near-line, offline) – Software • Digital capture • Cataloging • Preservation system – IP/Copyright (YIKES!) • http://www.library.illinois.edu/archives/workpap/TOPTEN2006.pdf – Time available

Technology Assessment • Current technology ‘stack’ and network – Hardware – Storage/redundancy – Operating systems (Windows, Linux, Mac) – Web servers (IIS, Apache, etc.) – Databases (sql server, mssql, postgres) – Language interpreters (php, .NET, python, etc.) • Get to know some IT lingo; it helps.

Self-Assessment and Training Resources • http://webjunction.techatlas.org/tools/ (helps determine staff skills and where training is needed, and shows what hardware and software are currently available) • http://www.nedcc.org/resources/leaflets.list.php • http://digitalwa.statelib.wa.gov/newsite/best.htm • SAA web training: http://saa.archivists.org/Scripts/4Disapi.dll/4DCGI/events/ConferenceList.html?Action=GetEvents

Determine or Negotiate Purpose and Scope • Relationship to Archival Program/State of Analog Records • Who is funding the work? • Why are you digitizing? – Preservation vs. access??? • What are you going to digitize (selection method)? • Who is the target audience? • Expected Results?

Writing a Plan • Length • Elements – Rationale/Overview – Description of Work (get specific) – Budget – Personnel – Advisors/Consultants – Technology and Standards – Outcomes – Promotion/Dissemination • DRES Sample • Digitization on Demand Service also needs a plan

Secure Funding • Internal or local donors • Possible external sources – Partner with an academic archives (deposit contract) – IMLS • LSTA grants, typically administered by State Library – NEH • Preservation Assistance Grants – NHPRC • regrant programs administered by state historic records advisory boards

Description, Metadata, and Access • Main messages – Base your digitization program on good descriptive practice – Standards should facilitate descriptive work – Aim for a merged analog/digital archives system

Description • A Roadmap for using Digitized Resources

Essential Information (‘Catalog’ Entry) • Collection title • Provenance (Creator) • Dates • Brief description • Finding Aid/Inventory

Core Descriptive Concepts • Levels of Description: Repository, “Collection”, Series, Subseries, File, Item • Inheritance • Move from General to Specific • Authority Control • Same core elements available at all levels
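
The ideas of levels and inheritance on this slide can be sketched in a few lines of Python. This is only an illustration, not how Archon or any other system stores its data: the `Unit` class, the `resolved` method, the series and item titles, and the date values are all hypothetical. The point it shows is that a lower level only records what differs from its parent, and everything else is looked up from the level above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Unit:
    """One node in the descriptive hierarchy (collection, series, file, item)."""
    level: str
    title: str
    creator: Optional[str] = None   # left empty when inherited from the parent
    dates: Optional[str] = None     # left empty when inherited from the parent
    parent: Optional["Unit"] = None

    def resolved(self, name: str) -> Optional[str]:
        """Return this unit's own value, or the nearest ancestor's value."""
        node = self
        while node is not None:
            value = getattr(node, name)
            if value is not None:
                return value
            node = node.parent
        return None


# Hypothetical sample data: describe at the general level first, then go specific.
collection = Unit("collection", "James Tayson Personal Papers",
                  creator="James Tayson", dates="1920-1965")
series = Unit("series", "Correspondence", parent=collection)
item = Unit("item", "Letter to chapter president", dates="1934", parent=series)

print(item.resolved("creator"))   # inherited from the collection
print(item.resolved("dates"))     # the item's own date
print(series.resolved("dates"))   # inherited from the collection
```

Because the same core elements exist at every level, one lookup routine serves the whole hierarchy.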

Core Descriptive Elements • Title: Collection – Name of record creator – Predominant type of material (Records or Papers if mixed) – e.g., James Tayson Personal Papers, Chapter Relations Subject File • Title: series, file, item – Type or genre of material – Subject content • Dates

Optional Descriptive Elements • Name of creator or creating entity • Biographical sketch/organizational history • Records description/scope and content • Subject access points

Biographical Sketch/Organizational History • Overview of the main events in the history of the creator or creating organization • Provides family members and others with enough information to understand the context in which the records were created • New Standard: EAC-CPF

Overview of Descriptive Standards • ISAD(G) and DACS – Describing Archives: A Content Standard • Encoded Archival Description – http://www.loc.gov/ead/ • Encoded Archival Context – http://eac.staatsbibliothek-berlin.de/ • Dublin Core – http://dublincore.org/
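
To give a feel for what one of these standards looks like in practice, here is a minimal sketch that writes a collection-level description using four Dublin Core elements (`title`, `creator`, `date`, `description`) with Python's standard library. The wrapping `record` element, the `dublin_core_record` function, and the sample values are assumptions made for illustration; a real system would follow whatever schema it exports to.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Serialize a dict of Dublin Core element names and values to XML text."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample values.
xml_text = dublin_core_record({
    "title": "James Tayson Personal Papers",
    "creator": "James Tayson",
    "date": "1920-1965",
    "description": "Correspondence and photographs.",
})
print(xml_text)
```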

Using these Data Standards “Although there is a profusion of metadata standards... It is important not to be too concerned about these, nor about the relationships and differences between them... The important thing is to understand what the metadata needs to do and then apply it in the most effective way to help achieve the objectives.” --Kate Cumming, “Metadata Matters,” in Julie McLeod and Catherine Hare, Managing Electronic Records • Use at highest level of control possible • Use at aggregate levels before series, file, item level • Use only what you can reliably support (assess resources) • Use in way that is consistent and ‘migratable’ • “Do no harm” • Fitness to purpose

Software Overview 1 • Hybridity. Need one or more systems to handle – Description of analog ‘stuff’ – Digitized items – Born digital materials • Geek tool: XML editor: www.oxygenxml.com

Software Overview 2 • Description Systems, Digital Object Systems, and Linked Systems • Making the choice • Proprietary options (Advantages and Disadvantages) – Past Perfect – http://www.eloquent-systems.com/ – ContentDM – Digital Asset Management (see Wikipedia)

Software Overview 3 • Open Source Systems – Archivists’ Toolkit (description only) – ICA AtoM (description and digital objects) – Archon (description and digital objects) – “Item-focused” digital object management • DSpace, Greenstone and other “repositories” – Exhibit-based software • Omeka – An evolving field

Resources to Evaluate Software • http://erecords.chrisprom.com/?page_id=175 • http://www.qsos.org/?page_id=7 • http://sosopensource.com

Archon Demonstration • http://www.archon.org • http://www.library.uiuc.edu/archives/

Description, Metadata, and Access Discussion

3. Conversion Process • Standards • Hardware • Software

Conversion Standards • VERY complex area, depending on format being converted • Resources: – http://www.cyberdriveillinois.com/departments/library/who_we_are/bestpractices.html – http://www.digitizationguidelines.gov • Many others; the point is to make an evidence-based decision; seriously consider outsourcing

Some General Thoughts • Images: – www.archives.gov/preservation/technical/guidelines.pdf – “Best Practice” is 600 DPI, TIFF files, but these can be very bulky – Reasonable compromise is 300 DPI for photos, 1200 DPI for slides, JPEG high quality – Be careful regarding post-scanning conversions
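
To put a number on "very bulky", the arithmetic can be sketched as follows. It assumes an 8 x 10 inch print scanned in 24-bit color and stored uncompressed; the print size, the bit depth, and the `uncompressed_bytes` function are assumptions for illustration. Real TIFF files may be smaller if compressed, and JPEG files will be much smaller.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel=24):
    """Approximate size of an uncompressed scan, ignoring file headers."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Compare the two photo resolutions mentioned on the slide.
for dpi in (300, 600):
    size = uncompressed_bytes(8, 10, dpi)
    print(f"{dpi} DPI: {size:,} bytes (about {size / 1024**2:.0f} MiB)")
```

Doubling the resolution quadruples the pixel count, so storage needs grow with the square of the DPI setting.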

Some more thoughts • Documents – http://www.library.illinois.edu/archives/services/digimage.php – http://hul.harvard.edu/rmo/policies_04b.shtml – Best practice is as above – But lower resolution will work well – If preservation matters, scan as JPEG; otherwise go straight to PDF/A

And a few more... • Audio – www.library.yale.edu/dpip/bestpractices/MediaBestPractices.doc – www.bcr.org/dps/cdp/best/digital-audio-bp.pdf • Sample rate of at least 44.1 kHz, 24 bit depth • WAV or AIF files (FLAC newly emerging) • Assemble digital audio toolbox (good analog equip + A/D converter (or top quality sound card) + firewire + workstation/storage and software) • http://audio-editing-softwarereview.toptenreviews.com/
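
The storage cost of the audio settings above can be estimated the same way. This sketch assumes uncompressed PCM audio (as in a WAV or AIF file) in two-channel stereo; the channel count and the `pcm_bytes` function are assumptions for illustration, and file headers are ignored.

```python
def pcm_bytes(seconds, sample_rate=44_100, bit_depth=24, channels=2):
    """Approximate size of uncompressed PCM audio, ignoring file headers."""
    bytes_per_sample = bit_depth // 8
    return seconds * sample_rate * bytes_per_sample * channels


one_hour = pcm_bytes(60 * 60)
print(f"One hour of stereo audio: {one_hour:,} bytes (about {one_hour / 1024**2:.0f} MiB)")
```

At these settings a single hour of tape comes to a little under one gigabyte, which is worth knowing before sizing storage and backup.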

And in conclusion! • Film/Video – www.library.yale.edu/dpip/bestpractices/MediaBestPractices.doc – www.bcr.org/dps/cdp/best/digital-audio-bp.pdf • Sample rate of at least 44.1 kHz, 24 bit depth • WAV or AIF files (FLAC newly emerging) • Assemble digital audio toolbox (good analog equip + A/D converter (or top quality sound card) + firewire + workstation/storage and software) • http://audio-editing-softwarereview.toptenreviews.com/

Recommended Outsourcing Options • http://www.scenesavers.com/ • http://themediapreserve.com/ • http://www.safesoundarchive.com/ • http://www.normicro.com/ (docs, books, photos)

Hardware • Storage – Local hard drive and backup, e.g. RAID – Local area network – Network attached storage – Online backup (don’t assume your webhost has it covered) • Basic Workstation elements – PC – Appropriate Conversion Hardware (scanners, audio players, etc.) – Encoding/Conversion Software

Software Recommendations • Images and Docs: Nothing beats Photoshop and Acrobat, BUT they are difficult to use. • OCR: OmniPage or ABBYY FineReader; but first use Acrobat • Audio: Audacity or Adobe Audition • Video: CyberLink PowerDirector?? Final Cut Express??

Conversion Process Discussion

4. Putting it all together: Workflow • Keep it simple! • Elements – File Naming/Handling Plan – Basic Analog to Digital Conversion – Post-Conversion operations – Item Description – Storage – Access Plan

File Naming Plan • Above all else: Be consistent and clear • Be careful about using too many subfolders; one per collection is best • Link to Collection ID • Unambiguous and sortable numeric syntax: COLLECTIONID-Box-Folder-Item – 2604001-10-2-001.tif • Or with name – 2604001-10-2-001PhotoOfPresident.tif • If you get stuck: – http://www.den4b.com/projects.php
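
A naming plan is easiest to keep consistent when a small script builds and checks the names. The sketch below follows the collection ID, box, folder, item pattern from the slide's example; the function names, the three-digit item padding, and the letters-only rule for the optional name are assumptions for illustration, so adjust them to your own plan.

```python
import re

# Collection ID, box, folder, three-digit item, optional name, extension.
PATTERN = re.compile(r"^(\d+)-(\d+)-(\d+)-(\d{3})([A-Za-z]*)\.(\w+)$")


def build_filename(collection_id, box, folder, item, label="", ext="tif"):
    """Build a file name such as 2604001-10-2-001.tif."""
    return f"{collection_id}-{box}-{folder}-{item:03d}{label}.{ext}"


def parse_filename(name):
    """Split a file name into its parts, or raise ValueError if it breaks the plan."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"File name does not follow the naming plan: {name}")
    collection_id, box, folder, item, label, ext = match.groups()
    return {"collection_id": collection_id, "box": int(box), "folder": int(folder),
            "item": int(item), "label": label, "ext": ext}


plain = build_filename("2604001", 10, 2, 1)
named = build_filename("2604001", 10, 2, 1, label="PhotoOfPresident")
print(plain)
print(named)
print(parse_filename(named))
```

Running `parse_filename` over a folder before files move to storage is a cheap way to catch names that drifted from the plan. If box and folder numbers need to sort correctly as text, padding them with zeros as well is one option.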

Analog to Digital Conversion • Provide clear direction to vendor, OR address: – Careful workstation design – Software/hardware selection and configuration – Pre-conversion services, e.g. conservation – Staff training and supervision – Daily backups – Etc., etc.

Post-conversion File Operations • Ensure preservation of ‘as converted copy’ • Cleanup routines • Create access copies if necessary (run daily batch process) • Move or copy files to permanent storage space. • CD/DVD Storage?

Description • Track in ONE location • Ensure minimal amount for each item: Title, dates • Value-added description – Creator, subjects, scope/content note • Example via http://sandbox.archon.org/

Storage • Covered later in detail • Important point: movement to storage must be a foolproof part of the computer or human system • Storage location MUST be secure (few have access) • At least double redundancy
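
One way to make the move to storage harder to get wrong is to have a script copy each file to two locations and confirm that every copy matches the original. The sketch below uses a SHA-256 checksum for that comparison; the technique, the function names, the folder names, and the stand-in file contents are assumptions for illustration, not a procedure prescribed in this session.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(source, destinations):
    """Copy a file into each destination folder and verify every copy."""
    source = Path(source)
    expected = sha256_of(source)
    for dest_dir in destinations:
        dest_dir = Path(dest_dir)
        dest_dir.mkdir(parents=True, exist_ok=True)
        target = dest_dir / source.name
        shutil.copy2(source, target)
        if sha256_of(target) != expected:
            raise IOError(f"Checksum mismatch for {target}")
    return expected


# Demonstration with a temporary folder and a stand-in "scan".
with tempfile.TemporaryDirectory() as tmp:
    master = Path(tmp) / "2604001-10-2-001.tif"
    master.write_bytes(b"stand-in for scanned image data")
    checksum = copy_verified(master, [Path(tmp) / "storage_a", Path(tmp) / "storage_b"])
    copies_ok = all((Path(tmp) / name / master.name).exists()
                    for name in ("storage_a", "storage_b"))

print(checksum, copies_ok)
```

Keeping the returned checksum alongside the item's description makes it possible to recheck the stored copies later.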

Access Plan • Who, what, where, when, why, how • Elements – Statement of principles – Public vs. non-public content – Copyright – Fees for any content? – Method of access – Tracking use (Analytics) • Reality: Not all access will be open; be prepared to manage it • http://www.library.illinois.edu/archives/services/

Questions???
