Digital Library Collections & Services Roy Tennant California Digital Library
Questions, Questions • You will leave with more questions than answers • If I do my job right, they will be the right questions • Feel free to ask questions as we go along
“It was the best of times, it was the worst of times…” — A Tale of Two Cities, Charles Dickens
The Common Perception
The Reality • Too many information sources • A lack of human assistance • Not enough ways to filter, sort, and narrow in on what is needed • Access is limited to what is free, or what has been purchased or rented on behalf of a clientele • Many useful resources are only available in print
The Most Commonly Proposed Solution
Digital Library Myths • Having everything in digital form will solve our information access problems • Soon (or eventually) everything will be digital • Any collection of digital objects can be a digital library • Everyone agrees about what comprises a “digital library” and how to build one
A Digital Library Is… • A collection of digital objects and/or information that is: – Selected – Organized – Made Accessible – Preserved • A set of services that help you to find and use those objects and information • Often supported by a physical collection and always by professional staff
Outline • Digital Library Collections – Licensing or Buying Collections – Digitizing Collections – Publishing – Providing Access to Remote Collections • Digital Library Services – Library Catalogs – Metasearching – Online Reference
Digital Library Collections • Licensing or Buying Collections • Digitizing Collections • Publishing
Licensing or Buying • Licensing more common than buying (but what do you have in the end?) • Libraries are increasingly demanding ownership, and/or content held in escrow • “The time to advocate change is before you sign” - Beverlee French, CDL
Digitizing Collections • Start and end with your users and the services you wish to provide • Review what others have done • Digitize at the highest quality that you can, and save an unprocessed copy • Capture as much metadata as you can, in highly granular fields, and store it in a form from which you can extract it without loss
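A minimal sketch of the last point, assuming a plain JSON file is an acceptable storage form. Every field name and value below is made up for illustration. The point is only that granular fields (family name separate from given name, resolution as a number) survive a round trip unchanged.

```python
import json

# Hypothetical record: all field names and values are illustrative only.
record = {
    "title": "Campus view, north gate",
    "creator": {"family_name": "Doe", "given_name": "Jane"},
    "date_captured": "2003-05-14",
    "scanner_resolution_dpi": 600,
    "master_file": "masters/0001.tif",  # the unprocessed copy
}

# Store in a structured text form, then read it back.
stored = json.dumps(record, indent=2, sort_keys=True)
restored = json.loads(stored)

# Nothing was lost: every granular field comes back as it went in.
print(restored == record)
```

Had the creator been stored as one display string, splitting it back into parts later would be guesswork.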
Metadata • “Cataloging by those paid better than librarians” • Structured information about an object or collection of objects • Types: – Descriptive – Administrative – Structural – Preservation
Core Metadata Standards • Dublin Core: a set of basic fields primarily for systems interoperability • MODS: a MARC-like bibliographic format • METS: a structural standard for encapsulating a digital object or set of digital objects, including one or more segments of descriptive and/or administrative metadata
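To make "a set of basic fields" concrete, here is a rough sketch of a record that uses three Dublin Core elements. The `record` wrapper element and the sample values are assumptions for illustration. Only the `dc` namespace URI and the element names `title`, `creator`, and `type` come from the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core elements namespace
ET.register_namespace("dc", DC_NS)

def dublin_core_record(fields):
    """Build an XML record with one dc:* element per (name, value) pair."""
    root = ET.Element("record")  # hypothetical wrapper, not part of Dublin Core
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root

record = dublin_core_record([
    ("title", "A Tale of Two Cities"),
    ("creator", "Dickens, Charles"),
    ("type", "Text"),
])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because the fields are few and generic, almost any system can produce or consume them, which is the interoperability argument on the slide.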
Publishing • Libraries are increasingly becoming involved with publishing activities • University libraries are capturing scholarship before it leaves campus, and making it freely available to all • Two examples: – Repositories – Book publishing
Repositories • Two flavors: – Institutional (e.g., MIT) – Topic (e.g., Physics) • Characteristics: – Often author-maintained; therefore metadata may be of uneven quality/quantity – Usually compliant with the Open Archives Initiative harvesting protocol • Benefits: – Captures a grey literature not always collected by libraries – If OAI-compliant, can be “crawled” and indexed
http://arxiv.org/
DSpace screenshot: http://dspace.org/ http://dspace.mit.edu/
http://repositories.cdlib.org/
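As a hedged illustration of what "crawling" an OAI-compliant repository involves, the sketch below builds a harvesting request URL. The base URL is a placeholder, not a real endpoint. `ListRecords` and the `oai_dc` metadata prefix are part of the OAI harvesting protocol. No network call is made.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Return an OAI-PMH ListRecords request URL for the given repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional: restrict the harvest to one set
    return f"{base_url}?{urlencode(params)}"

# Placeholder base URL, an assumption for illustration only.
url = list_records_url("https://repository.example.org/oai")
print(url)
```

A harvester would fetch this URL, parse the XML response, and index the Dublin Core records it contains.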
Books & Journals • Academic libraries, faculty, and university presses are teaming up: – Faculty write and edit – Libraries provide technical expertise, online access, persistence, professional collection management – University presses provide editing, print publication, imprimatur, marketing • Case Study: University of California
XML • A method of creating and using tags to identify the structure and contents of a document — not how it should be displayed • The tags used can be arbitrary or can come from a specification • XML is instrumental for sharing information between applications
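A small sketch of the first point, assuming an invented tag set (`book`, `chapter`, `heading`, `para`). The tags describe what each piece of text is, and a program can pull out the structure without knowing anything about display.

```python
import xml.etree.ElementTree as ET

# Arbitrary, made-up tags: they identify structure, not appearance.
BOOK_XML = """<book>
  <title>Sample Book</title>
  <chapter><heading>One</heading><para>First paragraph.</para></chapter>
  <chapter><heading>Two</heading><para>Second paragraph.</para></chapter>
</book>"""

book = ET.fromstring(BOOK_XML)

# Collect every chapter heading, in document order.
headings = [h.text for h in book.iter("heading")]
print(headings)
```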
Transforming XML • Extensible Stylesheet Language Transformations (XSLT) – A markup language and programming syntax for processing XML – Used to transform XML to another format (e.g., to HTML for delivery to standard web clients) or from one set of tags to another • An XML parser • A method to bring all the pieces together if serving to the web (e.g., CGI program, Java servlet, etc.)
[Diagram. Layers: Information, Transformation, Presentation. Components: book encoded in XML; XSLT stylesheet; web server; XHTML document (no display markup)*; HTML stylesheet (CSS). * Dynamic document]
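The "one set of tags to another" step can be sketched without an XSLT processor. The code below is not XSLT. It is a stand-in written with Python's standard library, and the tag mapping is an assumption made up for illustration. It plays the role a stylesheet's rules would play, turning a structurally tagged book into XHTML with no display markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source tags to XHTML tags
# (in a real pipeline, an XSLT stylesheet would hold these rules).
TAG_MAP = {"book": "body", "title": "h1", "chapter": "div",
           "heading": "h2", "para": "p"}

def transform(source):
    """Recursively copy an element tree, renaming each tag via TAG_MAP."""
    target = ET.Element(TAG_MAP.get(source.tag, "span"))
    target.text = source.text
    target.tail = source.tail
    for child in source:
        target.append(transform(child))
    return target

source = ET.fromstring(
    "<book><title>Sample Book</title>"
    "<chapter><heading>One</heading><para>First paragraph.</para></chapter></book>"
)
xhtml = ET.tostring(transform(source), encoding="unicode")
print(xhtml)
```

Appearance would then be left to a separate CSS stylesheet, so the same XML source could feed other outputs.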
XML & XSLT Demonstration
Library Catalogs • We seem to be unable to provide an easy and effective information locating tool • Keep in mind that only librarians like to search, everyone else likes to find • We are even failing at things we have explicitly tried to do • Let’s take a look at the evidence…
Typical Searches • Known Item • “A Few Good Things” • Comprehensive
Typical Searches: Known Item • The good: searches can be limited to a particular field: author, title, etc. • The bad: limiting to a particular field doesn’t always act the way you expect
Typical Searches: “A Few Good Things” • The one type of search we have so far ignored in library system design • A type of search that we can do something about today • Bring Google-style relevance to library catalogs (e. g. , for union catalogs, sort by number of holding libraries)
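The union catalog suggestion can be sketched in a few lines, assuming each result already carries a count of holding libraries. The record type, titles, and counts are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class CatalogRecord:
    title: str
    holding_libraries: int  # how many member libraries own this item

def rank_by_holdings(records):
    """Order results so the most widely held items come first."""
    return sorted(records, key=lambda r: r.holding_libraries, reverse=True)

# Made-up search results.
results = [
    CatalogRecord("Obscure edition", 2),
    CatalogRecord("Standard edition", 140),
    CatalogRecord("Reprint", 37),
]
top = rank_by_holdings(results)
print([r.title for r in top])
```

The holdings count is a crude popularity signal, but it needs no new data and puts "a few good things" at the top of the list.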
Typical Searches: Comprehensive • Most library catalogs hide many things available via regional cooperative or ILL • It is difficult to search all appropriate journal databases • Most libraries do not provide good access to gray literature and web sites • Subject headings are often unintuitive, and catalogs give no guidance
The Rescue of Print • Many library users want only that which is convenient (read: digital) • Print resources are, therefore, increasingly overlooked (I call this the “convenience catastrophe”) • We must fight this trend by enriching our catalog records with tables of contents, indexes, book covers, etc. to entice users to print books
Metasearching • Prevents the user from having to: – know which database to search – search each database individually – know the particular commands to search in each database • An incredibly complex problem that will likely take years to come close to solving well
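A toy sketch of the idea, assuming two in-memory "databases" with invented contents. One function sends the user's term to every source and merges the hits, so the user never chooses a database or learns its commands. Real metasearch must also cope with differing protocols, result formats, ranking, and duplicates, which is where the complexity lies.

```python
# Made-up sources standing in for separate databases.
CATALOG = ["Digital Libraries", "Metadata Basics"]
JOURNALS = ["Metadata Quality in Repositories", "Harvesting with OAI"]

def search_catalog(term):
    """Case-insensitive title match against the catalog."""
    return [t for t in CATALOG if term.lower() in t.lower()]

def search_journals(term):
    """Case-insensitive title match against the journal database."""
    return [t for t in JOURNALS if term.lower() in t.lower()]

SOURCES = {"catalog": search_catalog, "journals": search_journals}

def metasearch(term):
    """Query every source and return (source name, title) pairs."""
    hits = []
    for name, search in SOURCES.items():
        for title in search(term):
            hits.append((name, title))
    return hits

merged = metasearch("metadata")
print(merged)
```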
Source: ARL Statistics http://searchlight.cdlib.org/cgi-bin/searchlight
Slide from Greg Van Essen, Endeavor
Digital Reference • Putting the human help where it’s needed: online • Software is now available that provides for: – Queuing of patrons with audible alerts – Chat between librarian and user – Push web pages to the user – Form sharing – Highlighting on the user’s screen – “Follow me” browsing – Saved and/or emailed transcripts – Statistics
Interoperability • The digital library “holy grail” • Main requirement: widespread adoption of specific standards and protocols • Progress: – XML as the basic syntax – OAI provides a harvesting model – METS, MODS, and DC are key metadata standards – Technologies such as Web Services provide real-time interoperability
Understanding the Landscape • We must provide access to more resources than ever before • Many are digital, some are not • Some are interoperable, many are not • We need to find ways to build unified user services from a disparate collection of resources • Tools and strategies for doing this are becoming available (see OCLC Research Works, for example)
Trends • More and faster change (“change is the only constant”) • Better control of, and access to, historically difficult-to-access materials (e.g., working papers, data sets, etc.) • More publication options: – Digital repositories – Institution-based peer-reviewed publication avenues (journals, books, etc.) • A greater diversity of material types: – Multimedia, data, etc.
A Few Technologies and Trends to Watch • Repository systems • Systems for online peer review and publication • New kinds of “cataloging” (e. g. , Dublin Core, METS, MODS) • XML • Open Archives Initiative • Web services • Metasearching
The Right Questions to Ask • What do we need to do to serve our users better? • How can we build an infrastructure that can be used for a variety of purposes? • How can we better integrate access to print and digital material? • How can we interoperate with other systems and services? • What should we stop doing so we can do what is more important?
Toto, we’re not in Kansas anymore!