Digital Libraries: Study into the features of the DSpace Suite
Documentation Research and Training Centre, Indian Statistical Institute, Bangalore 560059
International Workshop On Building Digital Libraries, DRTC/ISI, 7th – 11th March 2005

Introduction
Digital libraries encompass a whole range of work related to information services, such as:
– Organization of digital information
– Information retrieval
– User interface
– Archiving and preservation
– Services and social issues
– Evaluation and applications to particular areas

Desirable Features of DL Software
• Structured
• Accessible
• Searchable
• Extensible
• Massive
• Heterogeneous
• Persistent

DL's operation should be examined under…
• Architectural design – modular and open
• Backend database – scalable, robust, data formats
• Network capabilities – web-based and seamless operations, persistent IDs, security and authentication
• Metadata and interoperability – compatible with world standards such as Dublin Core and OAI-PMH

Technical Issues
• Open source software vs. commercial OS
• Hardware and peripheral requirements
• Network components
• Standards – data formats, metadata, network, access, interoperability, encoding

Approaches to Building DL
• Digitization – retro-conversion of non-digital resources to digital
• Digitally born resources – involves interconversion to standard formats and storage

Why DSpace Digital Library
DSpace is
• An open source technology platform that can be customized and whose capabilities can be extended
• A service model for open access and/or digital archiving for perpetual access
• A platform for building an Institutional Repository whose collections are searchable and retrievable on the Web
• A means of making institution-based scholarly material available in digital formats; the collection will be open and interoperable

Architecture and System Requirement
The DSpace system is organized into three layers
– The Storage Layer: responsible for physical storage of metadata and content
– The Business Layer: deals with managing the content of the archive, users of the archive (e-people), authorization, and workflow
– The Application Layer: contains components that communicate with the networked world outside the individual DSpace installation, for example the Web user interface and the modules for the metadata harvesting service

Features of a near ideal DL
• Low cost, including all hardware and software components
• Technically simple to install and manage
• Robust
• Scalable
• Open and interoperable
• Modular
• User friendly
• Multi-user (including both searching and maintenance)
• Multimedia digital object enabled
• Platform independent (including both client and server components)

DSpace is a joint project of MIT Libraries and Hewlett-Packard Labs

What is DSpace?
• Digital object management system
• Create, search and retrieve digital objects
• Facilitate preservation of digital objects
• Open source software
• Allows open access and digital archiving
• Allows building Institutional Repositories

H/W and S/W requirements
• UNIX recommended (Java-based program should run on anything)
• Open source, built on Apache web server and Tomcat Servlet engine
• Uses PostgreSQL or Oracle relational database

What can DSpace do?
• Captures
– Digital content in any format directly from creators (e.g. researchers, authors)
• Describes
– Descriptive, technical, rights metadata
– Persistent identifiers
• OAI-PMH version 2.0 compliant
– Allows metadata creation

Possible types of Content
• Preprints, articles
• Postprints
• Technical Reports
• Conference Papers
• Theses/Dissertations
• Datasets – e.g. statistical, geospatial, scientific

Formats of Content
• Images – visual, scientific, etc.
• Audio files
• Video files
• Digitized library collections

File Formats
• Supported: formats the organization undertakes to support in the future; the repository administrator can inform submitters which formats these are
• Known: the format is recognized, but full support cannot be guaranteed
• Unsupported: the format is not recognized; such files are listed as "application/octet-stream" (Unknown)

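The three support levels can be illustrated with a small lookup. This is only a sketch under stated assumptions: the registry contents, the `SupportLevel` enum and both function names are hypothetical and are not DSpace's own API or its actual format registry.

```python
from enum import Enum


class SupportLevel(Enum):
    SUPPORTED = "supported"      # organization undertakes to support it
    KNOWN = "known"              # recognized, full support not guaranteed
    UNSUPPORTED = "unsupported"  # not recognized at all


# Hypothetical registry: which formats fall in which level is a local
# policy decision, so these two entries are purely illustrative.
FORMAT_REGISTRY = {
    "application/pdf": SupportLevel.SUPPORTED,
    "application/msword": SupportLevel.KNOWN,
}


def classify(mime_type: str) -> SupportLevel:
    """Return the support level; unrecognized formats are UNSUPPORTED."""
    return FORMAT_REGISTRY.get(mime_type, SupportLevel.UNSUPPORTED)


def listed_mime_type(mime_type: str) -> str:
    """Unrecognized formats are listed under the generic binary type."""
    if mime_type in FORMAT_REGISTRY:
        return mime_type
    return "application/octet-stream"
```

A repository administrator would maintain the real registry; the point here is only that anything outside it falls back to the generic type.
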
Information Model
• Communities – Departments, Labs, Research Centers, Schools…
• Collections
• Items
• Files (bitstreams)
– Multiple formats – same content
– Complex objects – multiple files

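The containment hierarchy in the information model (communities hold collections, collections hold items, items hold bitstreams) can be sketched as plain data classes. The class and field names below are illustrative assumptions, not DSpace's Java classes, and the sample data is invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bitstream:
    name: str
    mime_type: str


@dataclass
class Item:
    title: str
    # One item may carry the same content in several formats,
    # or several files forming one complex object.
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Community:
    name: str
    collections: List[Collection] = field(default_factory=list)


def count_bitstreams(community: Community) -> int:
    """Count every file stored anywhere under a community."""
    return sum(
        len(item.bitstreams)
        for collection in community.collections
        for item in collection.items
    )


# Invented sample: one thesis deposited in two formats.
thesis = Item(
    "Sample thesis",
    [
        Bitstream("thesis.pdf", "application/pdf"),
        Bitstream("thesis.doc", "application/msword"),
    ],
)
community = Community("Sample department", [Collection("Theses", [thesis])])
```

Here `count_bitstreams(community)` walks the whole hierarchy, which mirrors how a file is always reached through its item, collection and community.
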
Intellectual Property
• Click-through license during submission
• Grants DSpace non-exclusive right to acquire, manage, preserve, distribute the item
• Does not grant DSpace copyright
• Copy of license stored with item

Goodies
• Modular architecture, well-defined APIs
• 100% open source
– Programmed in Java
– RDBMS and SQL for metadata
• CNRI "handles" for persistent identifiers
• OpenURL linking
• OAI-PMH for exposing metadata

Backend Technology
• Apache, Tomcat, OpenSSL/mod_ssl
• Java
• PostgreSQL/Oracle
• CNRI Handle System 5 (persistent IDs)
• Lucene search engine

Standards
• Dublin Core only
– Descriptive metadata only
• OAI-PMH v2.0 (Open Archives Initiative Protocol for Metadata Harvesting)
• Unicode compliant

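An OAI-PMH 2.0 harvest request is an ordinary HTTP request carrying a `verb` and its arguments. The sketch below only builds such a URL; the base URL is a placeholder assumption (not a real repository), and `build_oai_request` is a hypothetical helper name. The six verbs and the `oai_dc` metadata prefix come from the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def build_oai_request(base_url: str, verb: str, **params: str) -> str:
    """Build an OAI-PMH request URL (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"


# Placeholder base URL; substitute the repository's actual OAI endpoint.
url = build_oai_request(
    "http://repository.example.org/oai/request",
    "ListRecords",
    metadataPrefix="oai_dc",  # unqualified Dublin Core
)
```

A harvester would fetch this URL and parse the XML response; that step is left out so the sketch stays free of network access.
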
Capabilities
• Exports in XML format
• Supports crosswalks through OAI-PMH
– DC (Dublin Core)
– Qualified DC
– METS (Metadata Encoding and Transmission Standard)
– MODS (Metadata Object Description Schema – sibling of MARCXML)
• Can be extended to any metadata schema

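A crosswalk is, at its simplest, a mapping from the elements of one schema to those of another. The following is a minimal sketch of that idea only: the mapping table is a simplified assumption for illustration (it is not DSpace's crosswalk configuration and covers just a few Dublin Core elements), and `crosswalk` is a hypothetical function name.

```python
# Simplified, illustrative mapping from unqualified Dublin Core element
# names to MODS element paths. A real crosswalk covers more elements and
# handles qualifiers and attributes.
DC_TO_MODS = {
    "title": "titleInfo/title",
    "creator": "name/namePart",
    "subject": "subject/topic",
    "description": "abstract",
    "publisher": "originInfo/publisher",
    "identifier": "identifier",
}


def crosswalk(dc_record: dict) -> dict:
    """Map a DC record {element: [values]} to MODS paths.

    Elements without a mapping are dropped in this sketch.
    """
    mods = {}
    for element, values in dc_record.items():
        target = DC_TO_MODS.get(element)
        if target is None:
            continue
        mods.setdefault(target, []).extend(values)
    return mods


# Invented sample record.
record = {
    "title": ["Sample report"],
    "creator": ["Author One", "Author Two"],
    "rights": ["Sample rights statement"],
}
mods_record = crosswalk(record)
```

Dropping unmapped elements silently is a choice made for brevity; a production crosswalk would more likely log or preserve them.
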
Customization
• Screens (Manakin)
• E-mails
• Any language interface
• Metadata input forms
• Display of results
• Fields to be indexed
• Access restrictions
• License (in addition to Creative Commons)

Advanced Features
• Grid compliant (storage)
• LDAP authentication
• Usage statistics generation
• SFX server integration
• RSS (Really Simple Syndication)
• Item recommendation to a friend
• Use of thesaurus (though not OWL/SKOS/RDF)
• Full-text indexing of PDF, MS Word files

Important Sites
• http://www.dspace.org
• http://www.sourceforge.net/projects/dspace
• http://wiki.dspace.org
• http://mailman.mit.edu/mailman/listinfo/dspace-general
• http://lists.sourceforge.net/lists/listinfo/dspace-tech
• http://lists.sourceforge.net/lists/listinfo/dspace-devel

DRTC Sites
• https://drtc.isibang.ac.in (Librarians' Digital Library)
• http://drtc.isibang.ac.in/dlrg (Discussion Forum)
• http://drtc.isibang.ac.in/sdl (Harvester in LIS)
• http://drtc.isibang.ac.in/blog

Questions?

Thank You