Best Practices, Tools, and Techniques for Digitizing Library Materials: A Snapshot of Library Digitization Practice in the US
Yan Quan Liu, Ph.D.
School of CLIS, SCSU, USA
Email: liu@southernct.edu
ICDL 2004
Background introduction
- A third of academic libraries and a quarter of public libraries are involved in digitization efforts; however, many of these libraries do not have policies to control the format and execution of such efforts (IMLS survey, 2002).
- Several thousand libraries have mounted digital library files that are not compatible or cross-searchable, nor, in many cases, easily integrated (Walker, 2003).
ICDL 2004, 10-Sep-21, liu@southernct.edu
Outline
1. Kinds of materials
2. Rules, guidance and/or standards
3. Technological problems/concerns
4. Conclusion
1. What kinds of materials are being digitized?
- University libraries: digitizing archives of newspapers, artifacts from other countries, coins, art, music, children's literature, and historical documents and images of international and cultural interest, particularly from early American and European history.
- Public libraries: focusing on smaller, local historic collections.
- School libraries: creating virtual libraries, primarily by linking to larger libraries' online resources. (Expense, time)
- Other types of libraries: meeting the special needs of specific populations. (Canadian National Institute's digital library)
1.1 Academic libraries
- Tend to be more involved in digitization than public or school libraries.
  - Possible reasons: greater access to historic artifacts and documents, as well as more adequate financial resources such as federal funds and foundation assistance.
- Often work in collaboration with national libraries and museums.
  - The University of Maryland: working with the International Children's Library and the Internet Archive "to create an extensive library of children's literature."
- Best practices:
  - The Harvard Law School Library is digitizing documents relating to the Nuremberg war crimes trials. Digital collections allow the public and researchers to view original photographs and documents, and to listen to historical speeches for themselves.
  - Brown University Library: 1,500 pieces of sheet music associated with African Americans; Johns Hopkins University: popular American sheet music from 1780 to 1960; Indiana University and the University of Michigan: creating an online digital archive of ethnomusicological videos.
  - UC Berkeley's Digital Scriptorium is digitizing medieval and Renaissance manuscripts.
1.2 Public libraries
- Have begun to engage in digitization projects more recently.
- Most public libraries are taking on smaller and more local projects.
  - Newspaper articles, photographs, essays, letters, contracts, etc.
  - Possible reasons: budget constraints, in addition to their interest in serving local community needs.
- A survey conducted by Sally (1997) covered public libraries serving a population greater than 50,000.
- Results from Sally's survey:

  Chosen materials            Percentage
  Photographic collections    77.1%
  Manuscripts                 31.2%
  Books and diaries           28.6%
  Postcards                   25.7%
  Maps and newspapers         14.3%
  Sound recordings            2.9%
  "Other materials"           20%
- Best practices:
  - The Alexandria Library has an online exhibition devoted to local postcards from 1707 to the 1980s, and has also digitized, inter alia, Civil War correspondence and information relating to historic town buildings.
  - The Internet Archive
    - A public, non-profit organization with the goal of ensuring open, free, and permanent access to digital collections of historical and cultural artifacts.
    - Saving records of culture and civilization, including text, audio, and moving images.
    - The Wayback Machine
2. What standards or guidelines are used in library digitization?
- Metadata standards
- Quality-related standards and guidelines
- Copyright
2.1 Metadata standards
- MARC standard: used in the Library of Congress's American Memory digital library projects and Brown University's African-American Sheet Music Project.
- Dublin Core: its fifteen-element metatagging scheme has received both positive and negative feedback because of its simplicity.
- EAD standard: an "international and interdisciplinary standard that helps libraries, museums, publishers, and individual scholars represent all kinds of literary and linguistic texts for online research and teaching, using an encoding scheme that is maximally expressive and minimally obsolescent."
- Encoding formats: SGML, XML, RDF.
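To make the fifteen Dublin Core elements concrete, here is a minimal Python sketch that writes a simple Dublin Core description as XML. The wrapper element name `record`, the function name `dublin_core_record`, and the sample values are assumptions made for illustration; they do not come from any of the projects named above.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen elements of simple Dublin Core.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dublin_core_record(fields):
    """Serialize a dict of Dublin Core fields as an XML string.

    The <record> wrapper is a hypothetical container chosen for this
    sketch; real projects wrap the elements in their own schema.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Sample values are invented for illustration only.
    print(dublin_core_record({
        "title": "Example sheet music title",
        "type": "Image",
        "format": "image/jpeg",
    }))
```

Every element is optional and repeatable in simple Dublin Core, which is one reason the scheme is seen as both easy to adopt and too coarse for some collections.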
- The Metadata Encoding and Transmission Standard (METS):
  - Being developed as an initiative of the Digital Library Federation.
  - A standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language of the World Wide Web Consortium.
  - Can express the complex links between various forms of metadata more easily than Dublin Core can, and can therefore provide a useful standard for the exchange of digital library objects between repositories.
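As a rough illustration of how METS links different kinds of metadata, the following Python sketch assembles a skeletal METS document with one descriptive metadata section, one file, and a structural map that ties them together. It is an assumption-laden outline rather than a schema-validated record: the IDs, the `build_mets` helper, and the example URL are hypothetical, and the administrative metadata section (`amdSec`) is omitted for brevity.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DC_NS = "http://purl.org/dc/elements/1.1/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def q(ns, tag):
    """Return a namespace-qualified tag or attribute name."""
    return f"{{{ns}}}{tag}"


def build_mets(title, image_url):
    """Build a skeletal METS tree (hypothetical IDs, no amdSec)."""
    for prefix, ns in (("mets", METS_NS), ("dc", DC_NS), ("xlink", XLINK_NS)):
        ET.register_namespace(prefix, ns)
    mets = ET.Element(q(METS_NS, "mets"))

    # Descriptive metadata: a Dublin Core title wrapped inside METS.
    dmd = ET.SubElement(mets, q(METS_NS, "dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, q(METS_NS, "mdWrap"), {"MDTYPE": "DC"})
    xml_data = ET.SubElement(wrap, q(METS_NS, "xmlData"))
    ET.SubElement(xml_data, q(DC_NS, "title")).text = title

    # File section: where the digitized image lives.
    file_sec = ET.SubElement(mets, q(METS_NS, "fileSec"))
    group = ET.SubElement(file_sec, q(METS_NS, "fileGrp"), {"USE": "master"})
    file_el = ET.SubElement(group, q(METS_NS, "file"), {"ID": "FILE1"})
    ET.SubElement(file_el, q(METS_NS, "FLocat"),
                  {"LOCTYPE": "URL", q(XLINK_NS, "href"): image_url})

    # Structural map: links the object to its description and its file.
    struct_map = ET.SubElement(mets, q(METS_NS, "structMap"))
    div = ET.SubElement(struct_map, q(METS_NS, "div"),
                        {"TYPE": "item", "DMDID": "DMD1"})
    ET.SubElement(div, q(METS_NS, "fptr"), {"FILEID": "FILE1"})
    return mets


if __name__ == "__main__":
    tree = build_mets("Example manuscript leaf", "http://example.org/leaf1.tif")
    print(ET.tostring(tree, encoding="unicode"))
```

The point of the sketch is the cross-referencing: the `DMDID` and `FILEID` attributes in the structural map point back to the descriptive section and the file entry, which is the kind of linking that a flat Dublin Core record cannot express.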
2.2 Quality-related standards and guidelines
- Library initiatives are setting standards for the minimum dpi, bit depths, compression, and file formats used for the digital library gallery and also for the master copies.
  - Brown University's African American sheet music collection image specifications include maintaining master images in high-quality IFF format at 300 dpi, while requiring the gallery images to be in JPEG format in three specific pixel sizes and tonal depths.
  - California State University: "the same content as the print version"; "the illustrations should retain original colors and have a minimum resolution of 600 dpi."
  - The Digital Scriptorium
    - Images are created by a photographic/digital process or by direct digitization, depending on which university does the digitizing.
    - The results are viewable in four resolutions: "thumbnail (for browsing: GIF, 220 × 220 pixels); life size (for overall viewing: JPEG, 75 pixels per inch), twice life size (JPEG, 150 p.p.i.), and higher resolution (for details: JPEG, 300 d.p.i.)".
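The resolution figures quoted above translate directly into pixel dimensions and storage requirements. The following Python sketch shows the arithmetic for a hypothetical 8 × 10 inch page (the page size, the 24-bit depth, and the function names are assumptions for illustration, not values from any of the projects cited).

```python
def pixel_dimensions(width_in, height_in, ppi):
    """Pixel width and height of a scan at the given pixels per inch."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_px, height_px, bit_depth):
    """Size in bytes of the raw, uncompressed image data."""
    return width_px * height_px * bit_depth // 8


if __name__ == "__main__":
    # Hypothetical 8 x 10 inch page scanned in 24-bit color.
    for ppi in (75, 150, 300, 600):
        w, h = pixel_dimensions(8, 10, ppi)
        size_mb = uncompressed_bytes(w, h, 24) / 1_000_000
        print(f"{ppi:>3} ppi: {w} x {h} pixels, {size_mb:.1f} MB uncompressed")
```

Doubling the resolution quadruples the pixel count, which is why projects typically keep a high-resolution master and derive smaller, compressed gallery images from it.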
2.3 Copyright
- Under Section 108 of the Copyright Law, libraries and archives have the right to make copies of works under certain conditions even if the work is copyrighted. The conditions include that the library or archive must not make copies for commercial advantage and that any copy must include a notice of copyright.
- "Sail the ocean blue through 1922." (Minow, 2002)
- If the item in question is not within the last twenty years of its copyright, the digitized copies cannot, by law, be posted on the Web; they can only be digitized and used in-house.
- Copyright statements:
  - "The photographs … have been reproduced in digital format expressly for this project as a one-time use, and may not be reproduced or copied in any format for any use other than personal study" (The Multnomah County Library).
  - "All images, text, and other materials in this site are protected under United States copyright laws, and may only be used in manners that constitute 'fair use' as defined by federal and international legal systems" (The Louisiana State University Photographic Collection).
- International rules and regulations are in the process of being developed and standardized.
3. What technological issues pose concerns for libraries in digitization practice?
- Software
  - Ideally, image processing software should allow for curvature correction and tidying up of the image created, so that the image remains true to the item's original condition.
  - One trend is the increased use of scaled-back and "simplified" photo imaging applications, such as Paint Shop Pro and "limited" versions of Adobe Photoshop.
- OAI-PMH
- Other issues
- OAI-PMH
  - The Open Archives Initiative (OAI) was based on two goals:
    - To develop a protocol that relies on existing Internet protocols (HTTP, IP, and TCP) to transfer metadata simply.
    - To develop a protocol that also relies on existing metadata standards, mandating the fifteen Dublin Core elements as the minimum index requirements for documents.
  - OAI released the first version of its Protocol for Metadata Harvesting (PMH) in January 2001 (the second and current version was released in June 2002).
  - It is the flexibility of OAI-PMH that is most appealing to librarians.
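Because OAI-PMH rides on plain HTTP, a harvesting request is just a URL with a `verb` argument and a few parameters. The Python sketch below builds such a URL; the base URL `http://example.org/oai` and the helper name `oai_request_url` are hypothetical, while the six verbs and the `oai_dc` metadata prefix are those defined by version 2.0 of the protocol.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request_url(base_url, verb, arguments=None):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = {"verb": verb}
    query.update(arguments or {})
    return f"{base_url}?{urlencode(query)}"


if __name__ == "__main__":
    # Hypothetical repository; oai_dc requests unqualified Dublin Core.
    print(oai_request_url(
        "http://example.org/oai",
        "ListRecords",
        {"metadataPrefix": "oai_dc", "from": "2004-01-01"},
    ))
```

A harvester would fetch this URL over HTTP and parse the XML response; every conforming repository must be able to answer with `oai_dc` records, which is where the fifteen Dublin Core elements come in.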
- Other issues
  - What media will be included in the collection? What is the best way to protect digital information from degradation? What is the plan for data migration? Will the collection include items that were "born digital"?
  - Is it necessary to establish arrangements with other libraries in order to provide access to a sufficient range and number of items? What guidelines and rules must be followed in order to be part of the sharing? How does sharing/collaborating affect hardware, software, and networking decisions?
4. Conclusions
- Access vs. preservation
  - LC, Cornell, Harvard, etc. all include conservation and preservation practices, so digitization can serve as one strategy for preservation.
    - Nuremberg Trials Project
    - Holocaust History Project
  - Many digital projects also endeavor to increase access to a range of materials in an effort to further education, awareness, and research.
- Interoperability
Acknowledgement
- The survey sponsors.
- My students, who collected data and other information.
39 References
The End!
Any questions? Thank you all!