The OAI-PMH Harvester Plugin for the Omeka Content Management System
LIS 654 BUILDING DIGITAL LIBRARIES, FALL 2011
NOVEMBER 03, 2011
JAMES R. GRIFFIN III 100356891

Defining the OAI-PMH
  • “The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a low-barrier mechanism for repository interoperability. Data Providers are repositories that expose structured metadata via OAI-PMH. Service Providers then make OAI-PMH service requests to harvest that metadata. OAI-PMH is a set of six verbs or services that are invoked within HTTP.”¹
  • Thus, the OAI-PMH enables digital repositories to openly and freely exchange metadata describing their collections with the world.
¹ Open Archives Initiative Protocol for Metadata Harvesting. (2011). Retrieved from http://www.openarchives.org/pmh/
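The "six verbs" mentioned in the quotation can be written down directly. The following is a minimal sketch listing the verb names defined by version 2.0 of the protocol; the one-line descriptions and the helper name `is_valid_verb` are illustrative and do not come from the slides.

```python
# The six request verbs defined by OAI-PMH 2.0, each with a short
# paraphrased description (the wording here is illustrative).
OAI_PMH_VERBS = {
    "Identify": "retrieve information about the repository itself",
    "ListMetadataFormats": "list the metadata formats the repository offers",
    "ListSets": "retrieve the set structure of the repository",
    "ListIdentifiers": "retrieve record headers only",
    "ListRecords": "harvest complete records",
    "GetRecord": "retrieve a single record by its identifier",
}


def is_valid_verb(verb):
    """Return True if `verb` is one of the six OAI-PMH verbs (case-sensitive)."""
    return verb in OAI_PMH_VERBS


if __name__ == "__main__":
    for name, description in OAI_PMH_VERBS.items():
        print(f"{name}: {description}")
```

Verb names are case-sensitive, so a harvester would pass them exactly as spelled above.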

Installing the OAI-PMH Harvester Plugin for Omeka
  1. Download the plug-in from the following source: http://omeka.org/add-ons/plugins/oai-pmh-harvester/ (Note: This is a ZIP archive, like other plug-ins for Omeka.)
  2. Upload the ZIP archive to the server wotan. (Note: This can be done using any SCP client, such as WinSCP.)
  3. Decompress the archive into the appropriate directory for your installation of Omeka. (Note: This is typically the path /home/[USER NAME]/omeka/plugins/.)
  4. Using the web interface, install the harvester plug-in.
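Step 3 can also be scripted rather than done by hand. The sketch below assumes the archive has already been uploaded to the server. The function name `extract_plugin` and the paths in the usage note are hypothetical and would need to match your own account.

```python
import zipfile
from pathlib import Path


def extract_plugin(archive_path, plugins_dir):
    """Unpack a plug-in ZIP archive into an Omeka plugins directory.

    Returns the sorted top-level names found in the archive so the caller
    can confirm which directory was created.
    """
    plugins_dir = Path(plugins_dir)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(plugins_dir)
        # Every member path starts with its top-level directory or file name.
        return sorted({name.split("/")[0] for name in archive.namelist()})
```

A call might look like `extract_plugin("oai-pmh-harvester.zip", "/home/[USER NAME]/omeka/plugins/")`, where the archive file name is an assumption and the bracketed user name must be filled in.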

The Purpose Behind the OAI-PMH
  • Metadata shared using the OAI-PMH is structured in a uniform manner, ensuring that metadata for all collections shared on the World Wide Web can be harvested regardless of the specific application that hosts them.
  • For example, one institution can archive content using the Drupal application as a repository, while another institution can archive content using Omeka.
  • Using the OAI-PMH, both repositories can be configured to exchange information detailing the contents of their archived collections.

Repository Interoperability
  • Unfortunately, not every digital repository has been developed using the same framework (or even the same programming language[s]).
  • Thus, if the OAI-PMH were to institute language-specific standards for exchanging metadata, some repository application would inevitably be developed in an unsupported language.
  • The solution to this is the software object.

OAI-PMH Metadata Objects
  • For the purposes of this presentation, a software object is a means by which to structure data in a language-independent manner.
  • As the Open Archives Initiative seeks to establish its protocol as the definitive standard for the exchange of repository metadata, this language independence increases the likelihood that future repository applications (some of which will be written in languages that do not yet exist) will still employ the protocol.

OAI-PMH Metadata Objects
  • The metadata objects are transferred over the HyperText Transfer Protocol (HTTP).
  • This means that no platform-specific binaries are needed in order to harvest OAI-PMH-compliant metadata (e.g., anyone can access information detailing the contents of these archived collections using a web browser; you do not need to purchase or install any additional software).
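Because the transport is plain HTTP, the same request a browser makes can be issued with nothing more than a standard library. The following is a rough sketch, not the harvester plug-in's own code. The function name `fetch_oai_response` and the `opener` parameter (which lets the network call be swapped out) are assumptions made for illustration.

```python
from urllib.request import urlopen


def fetch_oai_response(url, opener=urlopen, timeout=30):
    """Return the body of an OAI-PMH response as text.

    `opener` defaults to urllib's urlopen; any callable with the same
    (url, timeout=...) shape that returns a readable context manager works.
    OAI-PMH responses are XML encoded as UTF-8, so the body is decoded
    accordingly.
    """
    with opener(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Passing a repository's request URL to this function returns the same XML document that a web browser would display for that URL.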

OAI-PMH Metadata Objects
  • The metadata objects are bound to/serialized using the eXtensible Markup Language (XML).
  • This is mentioned for the sake of those who are enrolled in LIS 650, those who have previously taken LIS 650, or those who are familiar with web design.
  • For those unfamiliar with XML or web design itself, this simply means that this metadata can be extended and manipulated easily by web designers as well as developers.
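To illustrate what "extended and manipulated easily" can mean in practice, here is a small sketch that parses a Dublin Core fragment and appends one more element. The `<record>` wrapper, the function name `add_subject`, and the subject value in the usage note are hypothetical. The `dc` namespace URI is the standard Dublin Core elements namespace.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# Keep the conventional "dc" prefix when the tree is serialized again.
ET.register_namespace("dc", DC_NS)

# A hypothetical, simplified fragment; real OAI-PMH records are nested deeper.
SAMPLE = (
    f'<record xmlns:dc="{DC_NS}">'
    "<dc:title>/src/bin/psql.c</dc:title>"
    "</record>"
)


def add_subject(xml_text, subject):
    """Parse the XML, append a dc:subject element, and return the new XML."""
    root = ET.fromstring(xml_text)
    element = ET.SubElement(root, f"{{{DC_NS}}}subject")
    element.text = subject
    return ET.tostring(root, encoding="unicode")
```

Calling `add_subject(SAMPLE, "Source code")` returns the same fragment with an extra `dc:subject` element, leaving the existing title untouched.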

An Instance of an OAI-PMH Metadata Object
  • In order to generate OAI-PMH-compliant metadata objects for one’s collection, one must first install and configure another plugin: the OAI-PMH Repository (http://omeka.org/add-ons/plugins/oai-pmh-repository/).
  • Retrieving metadata from the repository: http://wotan.liu.edu/omeka/jgriffin/oai-pmh-repository/request?verb=ListRecords&metadataPrefix=oai_dc
  • The parameter “verb” specifies to wotan precisely what is being requested (e.g., a list of the records in my collections: “ListRecords”).
  • The parameter “metadataPrefix” specifies to wotan precisely which metadata format to use in formatting the response (e.g., “oai_dc” is the OAI’s format, which is based upon the Dublin Core framework).
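The request URL above is simply the repository's base URL followed by a query string carrying the two parameters. A minimal sketch of assembling it is given below; the function name `build_request_url` is an illustrative choice, while the base URL and parameter values are those shown on the slide.

```python
from urllib.parse import urlencode

BASE_URL = "http://wotan.liu.edu/omeka/jgriffin/oai-pmh-repository/request"


def build_request_url(base_url, verb, **arguments):
    """Append the verb and any further OAI-PMH arguments as a query string."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


url = build_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
print(url)
```

Changing only the `verb` argument (for instance to `Identify`, which takes no further arguments) yields a different kind of response from the same base URL.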

An Instance of an OAI-PMH Metadata Object
This was retrieved by requesting the following resource: http://wotan.liu.edu/omeka/jgriffin/oai-pmh-repository/request?verb=ListRecords&metadataPrefix=oai_dc

<OAI-PMH xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2011-11-03T19:46:59Z</responseDate> <!-- When I requested this object -->
  <request verb="ListRecords" metadataPrefix="oai_dc"> <!-- Which parameters were passed to wotan -->
    http://wotan.liu.edu/omeka/jgriffin/oai-pmh-repository/request
  </request>
  <ListRecords> <!-- A detailed listing of the collection records -->
    <record>
      <header>
        <identifier>oai:wotan.liu.edu/omeka/jgriffin/:5</identifier>
        <datestamp>2011-10-22T00:48:49Z</datestamp> <!-- Record creation time -->
        <setSpec>6</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.ope[...]">
          <!-- The Dublin Core Elements -->
          <dc:title>/src/bin/psql.c</dc:title>
          <dc:creator>Regents of the University of California</dc:creator>
          <dc:publisher> […]
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
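A response of this shape can be read programmatically. The sketch below uses a shortened, self-contained copy of the record shown above. One assumption to flag: the slide's excerpt omits the `xmlns` namespace declarations, which have been added here using the standard OAI-PMH 2.0, oai_dc, and Dublin Core namespace URIs so that the sample parses. The function name `parse_records` is illustrative.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map used only for searching; the prefixes are local choices.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Abridged sample; namespace declarations are an addition (see lead-in).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2011-11-03T19:46:59Z</responseDate>
  <request verb="ListRecords" metadataPrefix="oai_dc">http://wotan.liu.edu/omeka/jgriffin/oai-pmh-repository/request</request>
  <ListRecords>
    <record>
      <header>
        <identifier>oai:wotan.liu.edu/omeka/jgriffin/:5</identifier>
        <datestamp>2011-10-22T00:48:49Z</datestamp>
        <setSpec>6</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>/src/bin/psql.c</dc:title>
          <dc:creator>Regents of the University of California</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Return one dict per <record> with its header fields and DC values."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall("oai:ListRecords/oai:record", NS):
        records.append({
            "identifier": record.findtext("oai:header/oai:identifier", namespaces=NS),
            "datestamp": record.findtext("oai:header/oai:datestamp", namespaces=NS),
            "titles": [e.text for e in record.findall(".//dc:title", NS)],
            "creators": [e.text for e in record.findall(".//dc:creator", NS)],
        })
    return records
```

Titles and creators are collected as lists because Dublin Core elements are repeatable within a single record.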

Harvesting Metadata from Remote Repositories in Omeka
  • The plugin’s utility lies in its ability to import data describing items archived in a remote repository directly into one’s own repository.
  • Conceptually, the mechanisms underlying this process are similar to those used in the practice of “copy cataloging”.

Harvesting Metadata from Remote Repositories in Omeka
  • As previously specified, the remote server must be running an OAI-PMH repository for the archived collections.
  • In order to demonstrate this, I can harvest from my own OAI-PMH repository: http://wotan.liu.edu/omeka/jgriffin/oai-pmh-repository/request
  • …as well as from the Bibliothèque Numérique of l’Université Rennes 2*: http://bibnum.univ-rennes2.fr/oai-pmh-repository/request?verb=ListRecords&metadataPrefix=oai_dc
* This source was specified by Sheila Brennan of the Roy Rosenzweig Center for History and New Media. Please see http://omeka.org/blog/2011/08/29/do-you-share-your-data/
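For readers curious what a harvest involves underneath the web interface, the following is a rough sketch of a harvesting loop. It is not the plug-in's implementation. It relies on one protocol feature not covered in these slides: a repository may split a long `ListRecords` result across several responses and supply a `resumptionToken` for requesting the next part. The names `http_get` and `harvest_identifiers` are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def http_get(url):
    """Fetch a URL and return the raw response body."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def harvest_identifiers(base_url, metadata_prefix="oai_dc", fetch=http_get):
    """Yield the identifier of every record a repository offers.

    `fetch` can be replaced (e.g. in tests) by any callable that takes a
    URL and returns the XML response.
    """
    arguments = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(f"{base_url}?{urlencode(arguments)}"))
        path = "oai:ListRecords/oai:record/oai:header/oai:identifier"
        for identifier in root.iterfind(path, OAI_NS):
            yield identifier.text
        token = root.findtext(
            "oai:ListRecords/oai:resumptionToken", default="", namespaces=OAI_NS
        ).strip()
        if not token:
            return  # no token, or an empty one: the list is complete
        # A follow-up request carries only the verb and the token.
        arguments = {"verb": "ListRecords", "resumptionToken": token}
```

Iterating over `harvest_identifiers("http://wotan.liu.edu/omeka/jgriffin/oai-pmh-repository/request")` would walk the first repository listed above, provided the server is reachable.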

Harvesting Metadata from Remote Repositories in Omeka
  • Metadata sets can be re-harvested or deleted.
  • While a set of records is being harvested, one is offered the ability to “kill” the process.
  • Should there be problems regarding the memory required by the harvester, one can modify the settings of the plugin.
  • The “Memory Limit” field should only be modified if a harvest fails due to an error.
  • The path for the PHP binary should always be ‘/usr/bin/php5’ on wotan.

The OAI-PMH Harvester Plug-In for the Omeka Digital Archive Questions? Comments?