Digitization for Digital Scholarship Krista Jamieson Digitization Manager

Digitization for Digital Scholarship
Krista Jamieson, Digitization Manager
k.jamieson@mcmaster.ca

Outline
● What is digitization
● Selection
● Preparation & Metadata
● Specifications by Media Type
● Processing
● Managing Digital Assets, Preservation, and Access
● Considerations & Pitfalls
● Mini Tour

What is Digitization?
Process of making analog materials digital through scanning, digital photography, digital encoding, etc.
● Paper
● Books
● Maps & Plans
● Photographs
● Video tape
● Tape cassettes
● Vinyl records

What is Digitization?
● Digitization is slow and manual
  ○ Scans can take upwards of 5 minutes per item for paper-based materials, or the full running time of the recording for AV
  ○ Digitization goes faster with appropriate equipment; for example, scanning a book on a flatbed is time consuming and the results may not be ideal
● Need to prep material:
  ○ repairs to fragile items
  ○ removing staples & paper clips from loose paper

Selection
Considerations when selecting materials to digitize:
● Is it material that will support your research? Pick which items you want *before* digitizing to save time!
● What prep and processing work will be needed for the items? Are there substitutes that would be easier?
● Use & Intentions: Will you be sharing the digital versions? Using them in a publication? Only sharing a scraped data set?

Selection
● Rights of the material:
  ○ If you are planning on sharing the digital versions, you need to get copyright permission (yes, even if you are a student)
  ○ Privacy concerns? Sensitive materials?
  ○ Moral rights: is the use of this material appropriate for research? Is it appropriate to digitize? To share? You may need to seek permission beyond the legal copyright holder, especially if materials relate to marginalized communities or Indigenous peoples

Preparation & Metadata
● Who owns the material may dictate how you are able to prepare it for digitization. Can you break the spine of a book? Replace the plastic case of a tape cassette or VHS?
● The goal is to be non-destructive. Even if the owner gives you permission, you may need to refer back to the original later on
● Best to remove any staples, paperclips, etc. beforehand, and to flatten any bent or folded items

Preparation & Metadata
Give digitized files meaningful names! Titles or identifiers are good. Nothing worse than a hundred “image001.jpg” files.
If you are planning to share the files afterwards, make a spreadsheet with:
● title
● identifier (if applicable)
● date (if known)
● who owns the material
● creator
● type of item (e.g. photograph, book, map, etc.)
● any other relevant information (e.g. short description of what it is, keywords)
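One way to keep file names and the spreadsheet in step is to generate both from the same records. The sketch below is only an illustration: the column names, the `make_filename` and `write_metadata` helpers, and the "identifier_title" naming pattern are assumptions, not a convention taken from the slides.

```python
import csv
import re

# Hypothetical column names mirroring the spreadsheet fields listed above.
FIELDS = ["title", "identifier", "date", "owner", "creator", "type", "notes"]


def slugify(text):
    """Lowercase the text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def make_filename(identifier, title, extension):
    """Build a readable name such as 'ms-042_letter-to-the-editor.tif'."""
    slugs = [slugify(part) for part in (identifier, title) if part]
    stem = "_".join(s for s in slugs if s)
    return f"{stem}.{extension}"


def write_metadata(rows, out):
    """Write one CSV line per item, including the generated file name."""
    writer = csv.DictWriter(out, fieldnames=["filename"] + FIELDS)
    writer.writeheader()
    for row in rows:
        record = {field: row.get(field, "") for field in FIELDS}
        record["filename"] = make_filename(
            row.get("identifier", ""), row.get("title", ""), row.get("extension", "tif")
        )
        writer.writerow(record)
```

Passing an open file (or `io.StringIO()` for a quick check) as `out` produces a CSV that opens directly in any spreadsheet program.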

It’s finally time to digitize!

(Archival) Specifications by Media Type
Note: Specifications vary depending on research needs
Text (loose paper, books, notebooks, etc.):
● 300 DPI for good quality, text only, printed materials
● 600 DPI for handwriting, mixed text/graphs/images, poor quality
● 24-bit RGB colour
● Uncompressed TIFF or JPEG
● Capture all the way to the edge so no content is cut off. Some people like a blank border to be safe

(Archival) Specifications by Media Type
Photographs & Maps, Plans, Posters:
● 600 DPI
● 24-bit RGB colour (even for B&W photos)
● Uncompressed TIFF or JPEG
● Capture all the way to the edge so no content is cut off. Some people like a blank border to be safe
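To get a feel for what these settings mean for storage, the arithmetic can be sketched in a few lines. This is a rough estimate only: it assumes an uncompressed 24-bit image (3 bytes per pixel) and ignores file-format overhead, and the function names are made up for illustration.

```python
def pixel_dimensions(width_in, height_in, dpi):
    """Pixel width and height of a scan of a page measured in inches."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel=24):
    """Approximate size of the raw image data, ignoring file headers."""
    width_px, height_px = pixel_dimensions(width_in, height_in, dpi)
    return width_px * height_px * bits_per_pixel // 8


# A letter-size page (8.5 x 11 inches) at the two resolutions above.
for dpi in (300, 600):
    size_mb = uncompressed_bytes(8.5, 11, dpi) / 1_000_000
    print(f"{dpi} DPI: {pixel_dimensions(8.5, 11, dpi)} pixels, about {size_mb:.1f} MB")
```

Doubling the DPI quadruples the pixel count, so a 600 DPI scan needs about four times the storage of a 300 DPI scan of the same page.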

(Archival) Specifications by Media Type
Audio (including vinyl records, tape cassettes, reel to reel):
● 44.1 kHz or 48 kHz sampling rate
● 16 or 24 bit depth
● Uncompressed WAVE
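The same kind of estimate works for uncompressed audio: sampling rate times bytes per sample times channels times duration. The sketch below assumes stereo (two channels), which the slides do not specify, and ignores the small WAVE header; the function name is illustrative.

```python
def wave_bytes(sample_rate_hz, bit_depth, seconds, channels=2):
    """Approximate size of uncompressed PCM audio data."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


# One hour of tape at the lowest and highest settings listed above.
one_hour = 60 * 60
for rate, depth in ((44_100, 16), (48_000, 24)):
    size_mb = wave_bytes(rate, depth, one_hour) / 1_000_000
    print(f"{rate} Hz, {depth}-bit: about {size_mb:.0f} MB per hour")
```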

(Archival) Specifications by Media Type
Video (VHS, Beta, open reel):
● 96 kHz sampling rate
● 10-bit uncompressed
● MOV wrapper recommended (OS agnostic)
Film:
● Contact an AV preservation firm. Equipment is prohibitively expensive.

Processing
Digitized items are just straight captures
● Need to run Optical Character Recognition (OCR) to make text searchable
● Need to transcribe* handwriting
● For AV materials, need to add markers, create transcriptions, etc., depending on the research
● Different kinds of research may require specialized processing too!

Processing
● Maps & GIS processing was featured in a DMDS workshop in mid-January (sorry!)
● OCR software includes Adobe Acrobat, ABBYY FineReader, Microsoft OneNote (can do handwriting too), Google Docs, and many MANY others
● No OCR is 100% perfect. Accuracy depends on the quality of the scans, the clarity of text in the originals, etc. YOU WILL NEED TO MAKE CORRECTIONS
● Depending on research approach, you may need to export OCR content into a separate file to make analysis easier
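Once OCR text has been exported to a plain text file, some mechanical cleanup can be scripted before the manual corrections begin. The following is a minimal sketch, not a feature of any of the tools named above: `tidy_ocr_text` is a hypothetical helper that rejoins words split across line ends and unwraps hard line breaks while keeping paragraph breaks.

```python
import re


def tidy_ocr_text(raw):
    """Light cleanup of exported OCR text; it does not fix misread characters."""
    text = raw.replace("\r\n", "\n")
    # Rejoin words hyphenated across a line break ("digiti-\nzation").
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Turn single line breaks into spaces, leaving blank-line paragraph breaks.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces or tabs, and runs of blank lines.
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{2,}", "\n\n", text)
    return text.strip()
```

Note that the hyphen rule also merges genuine compounds that happen to fall at a line end (for example "time-\nconsuming" becomes "timeconsuming"), so the result still needs a human read-through.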

Screen caps of ABBYY & Acrobat

Managing Digital Assets, Preservation, and Access
● Digitization is step 1 of managing digital assets
● At minimum, you will need to ensure the digitized content is stored adequately until you are finished your research
  ○ Have multiple backup copies stored in different, secure places
  ○ Be cautious with cloud-based storage, as many services are not secured adequately to meet REB requirements (if applicable)
  ○ MacDrive is a viable McMaster-based option for backups
● Some grants and local policies may require you to digitally archive the research data you have created through digitization for confirmation purposes
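Backup copies are only useful if they match the originals. One common way to check is to compare checksums between the working folder and each copy. The sketch below uses only the Python standard library; the helper names are illustrative and the folder layout is an assumption.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large TIFF or WAVE files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to the folder) to its SHA-256 checksum."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_copies(original, backup):
    """Return relative paths that are missing from, or differ in, the backup."""
    source = build_manifest(original)
    copy = build_manifest(backup)
    return sorted(name for name, digest in source.items() if copy.get(name) != digest)
```

An empty list from `compare_copies` means every file in the original folder has an identical counterpart in the backup; it does not report extra files that exist only in the backup.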

Managing Digital Assets, Preservation, and Access
● Being digital doesn’t inherently make content accessible or preserved.
● Consider whether long term preservation and access would be beneficial to you and to other researchers, and whether it’s appropriate for the material
● Long term preservation and access require robust and secure storage, ongoing maintenance and preservation work, and a home where other people can find the content
  ○ Look into institutional repositories and research data management options (like Dataverse)

Considerations & Pitfalls
1. Think about what you’re going to do with your scans before you make them! Long term planning needs to start at the beginning.
2. Description of the content is what will make it discoverable later on. Contact your repository of choice for templates and requirements.
3. Sharing your corpus (when appropriate) can mean others can make use of work that you’ve already done. Format and completeness are something to consider for other users!
4. It is always important to be explicit about what you are contributing and what you are not contributing digitally (i.e. what you did not digitize or are not making available) and why (out of scope? rights issue?).

Thank you! Questions?
Krista Jamieson
Digitization Centre, L108, Mills Memorial Library
k.jamieson@mcmaster.ca