DLESE OAI Implementation: NSDL and the Open Archives Initiative

NSDL Annual Meeting, October 14, 2003, Washington, D.C.
John Weatherley (jweather@ucar.edu)
Digital Library for Earth System Education (DLESE)

Presentation Overview
• DLESE and OAI
• Implementation architecture
• Resumption tokens
• Deletions
• Extending OAI to support remote searching

DLESE and OAI
• The Digital Library for Earth System Education (DLESE)
    • Community-based effort made up of scientists, educators and students
    • Library resources are composed of many independently managed thematic collections (currently thirteen)
• OAI-PMH adopted as a primary means of gathering metadata from contributors to a central repository
• A single public OAI data provider serves the metadata from the DLESE central repository, with one set per collection
    • Each collection is harvested and identified separately by the NSDL
• DLESE OAI software developed to support interoperability among contributors
    • Offered under an open source license to the DLESE, NSDL and OAI communities

Implementation Architecture
• Basic approach: database + files
    • Simple database used to implement searching by set, format and date (from, until) and to store deletion status
    • XML files used as the operating data store
    • Thread monitors the files to detect additions, deletions and changes in file modification date
    • Additions, deletions and changes in file modification date trigger a change in the OAI date stamp for each item
• Tools: Java J2EE, Lucene (database/IR search API), JSPs used for OAI-PMH responses and admin pages, Struts (web application framework), dom4j XML parser
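The file-monitoring idea above can be sketched in a few lines. This is an illustrative Python sketch only, not the DLESE software (which is written in Java): the function names, the {path: modification time} snapshot layout and the index record shape are all assumptions made for the example.

```python
def diff_snapshots(previous, current):
    """Compare two {path: mtime} snapshots of the metadata directory.

    Returns (added, removed, changed) as sorted lists of paths.
    The snapshot layout is a hypothetical one chosen for this sketch.
    """
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    changed = sorted(
        path for path in current
        if path in previous and current[path] != previous[path]
    )
    return added, removed, changed


def apply_changes(index, previous, current, now):
    """Update the index so every detected change gets a new OAI date stamp.

    `index` maps path -> {"datestamp": str, "deleted": bool} (assumed shape).
    Removed files stay in the index, flagged as deleted, so that the
    deletion can still be reported to harvesters.
    """
    added, removed, changed = diff_snapshots(previous, current)
    for path in added + changed:
        index[path] = {"datestamp": now, "deleted": False}
    for path in removed:
        index[path] = {"datestamp": now, "deleted": True}
    return index
```

A monitoring thread would take a fresh snapshot at intervals and call apply_changes with the previous one. A file that reappears after removal shows up as an addition, which clears the deleted flag and moves the date stamp forward.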

Resumption Tokens
• All state is encoded into the token
    • Token exactly defines the database query for the current and next sequences in the list
    • Advantage: no need to cache state on the server
    • Database must return results deterministically
    • Changes, additions and deletions that occur between sequences will be propagated now or in the next harvest

Example token:
    <resumptionToken completeListSize="909" cursor="0">
    0/500/909/nsdl_dc/edmall/2002-10-07T18:15:37z/null
    </resumptionToken>

Token fields, in order:
    0                           Offset to the current sequence
    500                         Offset to the next sequence
    909                         completeListSize
    nsdl_dc                     metadataPrefix
    edmall                      setSpec (may be null)
    2002-10-07T18:15:37z/null   from/until (may be null)
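As a rough illustration of a token that carries all of its own state, the sketch below decodes and re-encodes the example token in Python. The field order follows the labels on the slide; the TokenState type and the function names are hypothetical, and the actual DLESE implementation may differ.

```python
from typing import NamedTuple, Optional


class TokenState(NamedTuple):
    """Query state carried by the token (field order assumed from the slide)."""
    offset: int               # offset to the current sequence
    next_offset: int          # offset to the next sequence
    complete_list_size: int
    metadata_prefix: str
    set_spec: Optional[str]   # None when the token says "null"
    from_date: Optional[str]
    until_date: Optional[str]


def _none_if_null(value):
    return None if value == "null" else value


def _null_if_none(value):
    return "null" if value is None else value


def decode_token(token):
    """Split a slash-delimited token into its seven fields."""
    parts = token.strip().split("/")
    if len(parts) != 7:
        raise ValueError("expected 7 slash-separated fields, got %d" % len(parts))
    return TokenState(
        int(parts[0]), int(parts[1]), int(parts[2]), parts[3],
        _none_if_null(parts[4]), _none_if_null(parts[5]), _none_if_null(parts[6]),
    )


def encode_token(state):
    """Serialize the state back into the slash-delimited form."""
    return "/".join(_null_if_none(v) if not isinstance(v, int) else str(v)
                    for v in state)
```

Because the decoded state is enough to rebuild the database query, the server needs no per-harvester session; it only has to return results in the same order each time.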

Deletions
• Data provider
    • Files that are removed are flagged as deleted and date stamp is incremented
    • If file is restored, item is restored and date stamp is incremented
• Harvester
    • Maintains a mirror of the remote repository
    • When remote item is deleted, local item is deleted
    • If remote item is restored, local item is restored
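The harvester side of this behavior might look like the following Python sketch. OAI-PMH marks a deleted record with status="deleted" in its header; everything else here (the dictionary-based mirror, the record layout, the function name) is an assumption for illustration rather than the DLESE harvester's actual design.

```python
def apply_harvested_records(mirror, records):
    """Bring a local mirror in line with one batch of harvested records.

    `mirror` maps OAI identifier -> metadata (hypothetical storage).
    Each record is a dict with "identifier", an optional "status"
    (the value "deleted" mirrors the OAI-PMH header attribute) and,
    for live records, "metadata".
    """
    for record in records:
        identifier = record["identifier"]
        if record.get("status") == "deleted":
            # Remote item was deleted: drop the local copy if present.
            mirror.pop(identifier, None)
        else:
            # New, changed or restored item: store the current metadata.
            mirror[identifier] = record["metadata"]
    return mirror
```

Since a restored item receives a new date stamp on the provider, it falls inside the next incremental harvest and is written back into the mirror by the same code path as any other update.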

Extending OAI to Support Searching by Remote Clients
• DLESE ODL search: a web service developed to enable remote keyword, Boolean and field/value IR searches over the data repository
• Search query is passed in the set argument for ListRecords or ListIdentifiers requests
• Responses are returned using a standard OAI-PMH response container
• Stateless: pushes flow control to the client
• Is an example of a Representational State Transfer (REST) style web service
• Similar to the Open Digital Library (ODL) search 1 specification

DLESE ODL Search Request
Search request issued by the client is a standard ListRecords or ListIdentifiers request with the search query inserted into the set argument.

Example request:
    http://www.dlese.org/oai/provider?verb=ListRecords&metadataPrefix=nsdl_dc&set=dleseodlsearch/ocean/null/0/10

Parts of the request:
    verb=ListRecords          Response type
    metadataPrefix=nsdl_dc    Metadata format
    dleseodlsearch            Indicates this is a search request rather than a standard ListRecords or ListIdentifiers request
    ocean                     Search query
    null                      The set to search over, or null to search all
    0                         Offset in the results list at which results should begin
    10                        Number of records to return in the response

Full specification is available at: http://www.dlese.org/oai/docs/index.html#odl
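A client could assemble such a request as in the Python sketch below. The function name and its parameters are hypothetical, and the order of the last two path segments (offset, then number of records) is inferred from the example rather than taken from the full specification, which should be consulted before relying on it.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, metadata_prefix,
                     set_spec=None, offset=0, length=10, verb="ListRecords"):
    """Build a DLESE ODL search request URL (illustrative sketch).

    The search is packed into the OAI `set` argument as
    dleseodlsearch/<query>/<set or null>/<offset>/<length>.
    Assumption: offset comes before length, as the example suggests.
    """
    search_set = "/".join([
        "dleseodlsearch",
        query,
        set_spec if set_spec is not None else "null",
        str(offset),
        str(length),
    ])
    params = {
        "verb": verb,
        "metadataPrefix": metadata_prefix,
        "set": search_set,
    }
    # urlencode percent-encodes the slashes in the set value (%2F),
    # the same form the provider echoes back in its <request> element.
    return base_url + "?" + urlencode(params)


url = build_search_url("http://www.dlese.org/oai/provider", "ocean", "nsdl_dc")
```

Paging is then a matter of calling the function again with a larger offset, which is how flow control ends up on the client side.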

DLESE ODL Search Response
Search response is a standard OAI ListRecords or ListIdentifiers response that contains a segment of the ordered list of matching results.

Example response:
    <?xml version="1.0" encoding="UTF-8" ?>
    <OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
      <responseDate>2003-10-09T18:03:49Z</responseDate>
      <request verb="ListRecords" set="dleseodlsearch%2Focean%2Fnull%2F0%2F10"
          metadataPrefix="nsdl_dc">http://www.dlese.org/oai/provider</request>
      <ListRecords>
        <record>
          <header>
            <identifier>oai:dlese.org:NASA-Edmall-2375</identifier>
            <datestamp>2003-09-02T15:25:44Z</datestamp>
            <setSpec>edmall</setSpec>
          </header>
          <metadata>
            <nsdl_dc:nsdl_dc xmlns:nsdl_dc="http://ns.nsdl.org/nsdl_dc_v1.01"
                xmlns:dc="http://purl.org/dc/elements/1.1/"
                xmlns:dct="http://purl.org/dc/terms/"
                xmlns:ieee="http://www.ieee.org/xsd/LOMv1p0"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                schemaVersion="1.001"
                xsi:schemaLocation="http://ns.nsdl.org/nsdl_dc_v1.01 http://ns.nsdl.org/schemas/nsdl_dc_v1.01.xsd">
              <dc:title>What is Ocean Color</dc:title>
              <dc:description>Explanation of ocean color composition, as indicator of biotic factors and parameter for other variables in remote sensing</dc:description>
              …
        </record>
        …
        <resumptionToken completeListSize="361" cursor="0" />
      </ListRecords>
    </OAI-PMH>

• Request element shows the query that was performed and the segment returned
• The resumptionToken indicates the complete list size of matching results and the offset into the list upon which the current segment of results begins
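On the client, the segment and its position in the result list can be read with a standard XML parser. The Python sketch below uses a trimmed, well-formed sample built from the values in the example response; the function name and the sample string are illustrative, not part of the DLESE software.

```python
import xml.etree.ElementTree as ET

# Namespace of all OAI-PMH 2.0 response elements.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Trimmed sample response (metadata omitted), made up for this sketch
# from the values shown in the example.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2003-10-09T18:03:49Z</responseDate>
  <request verb="ListRecords">http://www.dlese.org/oai/provider</request>
  <ListRecords>
    <record>
      <header>
        <identifier>oai:dlese.org:NASA-Edmall-2375</identifier>
        <datestamp>2003-09-02T15:25:44Z</datestamp>
        <setSpec>edmall</setSpec>
      </header>
    </record>
    <resumptionToken completeListSize="361" cursor="0" />
  </ListRecords>
</OAI-PMH>"""


def read_segment(xml_text):
    """Return (identifiers, cursor, complete_list_size) for one response.

    cursor and complete_list_size are None when the response carries
    no resumptionToken element.
    """
    root = ET.fromstring(xml_text)
    identifiers = [el.text for el in root.iter(OAI_NS + "identifier")]
    token = root.find(".//" + OAI_NS + "resumptionToken")
    if token is None:
        return identifiers, None, None
    return (identifiers,
            int(token.get("cursor")),
            int(token.get("completeListSize")))


identifiers, cursor, total = read_segment(SAMPLE)
```

With the cursor and the total in hand, a client can decide whether to issue another search request with a larger offset.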

DLESE ODL Search
Clients use the web service to perform remote searches and customize the display of results.
Example: IdeaKeeper

Resources
• Workshop presentation slides, links to tools and other OAI resources are located on the session Swiki and at: http://www.dlese.org/libdev/interop/index.html
• DLESE OAI software download and installation: http://sourceforge.net/project/showfiles.php?group_id=23991
• OAI-PMH v2.0 specification: http://www.openarchives.org/OAI/openarchivesprotocol.html
• List of registered data providers: http://www.openarchives.org/Register/BrowseSites.pl
• Repository explorer: http://oai.dlib.vt.edu/cgi-bin/Explorer/oai2.0/testoai
• Open Digital Libraries: http://oai.dlib.vt.edu/odl/
• Representational State Transfer (REST): http://internet.conveyor.com/RESTwiki/moin.cgi/ShortSummaryOfRest