Using Multiple Metadata Formats in DSpace ARD Prasad

Using Multiple Metadata Formats in DSpace ARD Prasad Indian Statistical Institute Bangalore, India
MARC & Metadata
MARC
• Covers all types of documents (more than 1000 elements)
• Basically used for OPACs
• Plethora of MARCs
• Requires librarians
• Uses ISO-2709, XML
Metadata
• Separate schemes for each type of document
• Web documents, digital libraries
• Plethora of metadata formats
• Meant for non-librarians
• Uses XML
Metadata Formats
• More than two dozen formats available for every conceivable digital object
• ETD
• E-Learning
• E-Governance
• Geo-Spatial Data
• Architectural Drawings
• Museum items
List of Some Metadata Formats
• DC
• METS
• MODS
• VRA Core
• SCORM
• LOM
• GEM
• EAD
• TEI
• CIMI
• PB Core
• IMRC
• CDWA
• CSDGM
• MIDAS
• VERS
• DDI
• PREMIS
• CIDOC
• ETDMS
• AGLS
• GILS
• ONIX
DSpace
• Default workflow supports Qualified Dublin Core
• DSpace OAI supports unqualified Dublin Core
• DSpace v. 1.2.2 allows you to extend to non-DC formats
Metadata Issues in DSpace
• Adding New Elements
• Input Forms
• Indexing
• Display of search results
• Import/Export
• OAI-PMH and crosswalks
Adding New Elements
• Dublin Core Registry using DSpace administration
• Directly adding to the ‘dctyperegistry’ table in PostgreSQL
Input Workflow
• Using the new facility by modifying
– $DSPACE_HOME/config/inputforms.xml
Indexing
• Add the elements to be indexed to the dspace.cfg file, so that Lucene generates indexes on the desired elements
Search Result Display
• The full-record display need not be modified
• The default display can be changed by modifying the ItemTag.java file
Import & Export
• Within the DSpace community, it really does not matter, though DSpace produces a DC-like format in a file called dublin_core.xml
• However, across other DL software, we should evolve an interoperability mechanism
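The DC-like export file mentioned above can be illustrated with a minimal Python sketch. It assumes the layout commonly seen in DSpace's dublin_core.xml (a `dublin_core` root holding `dcvalue` entries with `element` and `qualifier` attributes); the function name and the sample values are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def build_dublin_core_xml(values):
    """Serialize (element, qualifier, text) triples as a dublin_core.xml-style string.

    Assumed layout: <dublin_core><dcvalue element=".." qualifier="..">text</dcvalue></dublin_core>
    """
    root = ET.Element("dublin_core")
    for element, qualifier, text in values:
        # An unqualified element is written with qualifier="none" (assumption).
        node = ET.SubElement(
            root, "dcvalue", element=element, qualifier=qualifier or "none"
        )
        node.text = text
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample record.
sample = [
    ("title", None, "Using Multiple Metadata Formats in DSpace"),
    ("contributor", "author", "Prasad, ARD"),
]
xml_text = build_dublin_core_xml(sample)
print(xml_text)
```

A non-DC format would need its own serializer of this kind, which is where the interoperability question across DL software arises.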
OAI-PMH
• The new format should appear in the OAICat.properties file
• Java programs similar to OAIDCCrosswalk.java should be written for each metadata format
Issues of Crosswalk
• A crosswalk will always result in some data loss
• One should use ‘selective harvesting by collection’, using the appropriate
– ‘metadataPrefix’ argument and
– ‘set’ argument for limiting the collection
• One may consider DC as the lowest common denominator
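Selective harvesting by collection can be sketched as an OAI-PMH `ListRecords` request that carries both a `metadataPrefix` and a `set` argument. The base URL and the set identifier below are hypothetical placeholders; a real repository publishes its own values through `Identify` and `ListSets`.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    metadata_prefix selects the metadata format; set_spec, when given,
    limits harvesting to one set (for example, one collection).
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical repository endpoint and set identifier.
url = list_records_url(
    "http://repository.example.org/oai/request", "oai_dc", "hdl_1234_5"
)
print(url)
```

Requesting `oai_dc` works against any compliant repository, which is one practical reason to treat DC as the lowest common denominator.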
Possible relations between two metadata formats
• Crosswalk can be achieved, in case of
– One to one (ideal, but not real)
– Many to one
• Crosswalk will be lossy, in case of
– One to many
– One to none
– None to one
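The five relations above can be made concrete with a small Python sketch. All field names and the mapping are hypothetical; the sketch treats a source field as safely mappable only when it has exactly one target, and reports everything else as lost or unfilled.

```python
def crosswalk(record, mapping, target_fields):
    """Map a source record onto target fields.

    Returns (target_record, lost_source_fields, unfilled_target_fields).
    """
    target = {name: [] for name in target_fields}
    lost = []
    for field, values in record.items():
        candidates = mapping.get(field, [])
        if len(candidates) == 1:
            # One to one, or many to one: a single unambiguous target.
            target.setdefault(candidates[0], []).extend(values)
        else:
            # One to many (ambiguous) or one to none (no target): lossy.
            lost.append(field)
    # None to one: target fields that no source field could fill.
    unfilled = [name for name, vals in target.items() if not vals]
    return target, lost, unfilled


# Hypothetical source format and mapping to DC-like target fields.
mapping = {
    "main_title": ["title"],               # one to one
    "author": ["creator"],                 # many to one
    "corporate_author": ["creator"],       # many to one
    "name": ["creator", "contributor"],    # one to many
    "shelf_mark": [],                      # one to none
}
record = {
    "main_title": ["A"],
    "author": ["X"],
    "corporate_author": ["Y"],
    "name": ["Z"],
    "shelf_mark": ["Q1"],
}
target, lost, unfilled = crosswalk(
    record, mapping, ["title", "creator", "contributor", "rights"]
)
print(target, lost, unfilled)
```

In this sketch the many-to-one case still loses the distinction between the two source fields, even though no values are dropped.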
Suggestions for DSpace Development
• inputforms.xml can be modular, so that input forms can be defined in separate files for each format
• dctyperegistry should have an element ‘metadata format’ so that OAICat exposes metadata of records which were created using a specific format
• Perhaps the OAI-PMH protocol itself requires modification (imagine harvesting repositories with varied items and metadata formats)
Thank You
Please visit:
• LDL: Librarians’ Digital Library
– https://drtc.isibang.ac.in
• SDL: Search Digital Libraries (Harvester)
– http://drtc.isibang.ac.in/sdl
• Our Discussion Forum (DLRG)
– http://drtc.isibang.ac.in/dlrg