CERN Library Requirements
T. Baron, CERN ETT-DH-CDS
07/11/2002
Thomas Baron - JACoW Workshop
Outline
  - CERN Library Wishes
  - JACoW Records in CERN Library
  - A Possible Solution: OAI
CERN Library Wishes (1)
  - In the CERN Library Mission Statement:
    - Keep track of all CERN production
    - Maintain it for the long term
    - Produce an annual report
  - Obligation to get information from JACoW
  - Up to 2002: “semi-manual” import
    - A lot of human effort
    - Delay in publishing
    - Difficulty achieving completeness
CERN Library Wishes (2)
  - Automatic procedure
    - To get metadata
      - Language
      - Author(s)
        - Name
        - Affiliation
      - Title
      - Conference:
        - Name
        - Place
        - Date
      - Abstract
      - [Number of pages]
      - Link to fulltext document
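The metadata wish list above can be sketched as a small data structure. This is only an illustration: the class and field names (`PaperRecord`, `fulltext_url`, `missing_fields`, and so on) are assumptions made for this sketch and do not come from the slides or from any CDS or JACoW software. The bracketed "[Number of pages]" is read here as an optional field.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Author:
    name: str
    affiliation: str


@dataclass
class Conference:
    name: str
    place: str
    date: str  # free text; the slides do not specify a date format


@dataclass
class PaperRecord:
    """One conference paper, with the fields listed on the slide."""
    language: str
    authors: List[Author]
    title: str
    conference: Conference
    abstract: str
    fulltext_url: str
    number_of_pages: Optional[int] = None  # assumed optional (bracketed on the slide)


def missing_fields(record: PaperRecord) -> List[str]:
    """Return the names of required fields that are empty (hypothetical helper)."""
    missing = []
    for name in ("language", "title", "abstract", "fulltext_url"):
        if not getattr(record, name):
            missing.append(name)
    if not record.authors:
        missing.append("authors")
    return missing
```

A check such as `missing_fields(record)` is one way an automatic import could flag incomplete records instead of relying on manual review.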
CERN Library Wishes (3)
  - Automatic procedure
    - Possibly to get fulltexts. 2 solutions:
      - Keep only the links to the JACoW fulltext archive
        - URL stability?
      - Fetch the fulltexts and store them in the CERN Document Server
        - Long-term storage place
        - Added services
JACoW Records in CERN Library (1)
  - Metadata enhancement:
    - Normalisation (authors, experiments…)
    - Completeness (reference numbers…)
  - Added services:
    - Integration in a library catalogue
      - Multiple catalogue search (e.g. search for “Evans” inside “Conference papers”, “pictures” and “journal articles”)
    - CDS personalization tools
      - Alerts, baskets…
    - On fulltexts:
      - Keyword extraction
      - Fulltext indexing
      - Automatic conversion
      - Citation extraction
JACoW Records in CERN Library (2)
  - Increased visibility:
    - 156000 different users/year
    - Interlinking CERN -> SLAC
    - Launch search on other systems (Google, KEK, SLAC…)
    - HEPDOC: parallel search on CDS, KEK and SLAC
A Possible Solution: OAI (1)
  - Open Archives Initiative: protocol for exposing/harvesting metadata, initiated by LANL people in 1999
  - Aims at simplicity and ease of implementation
  - Widely supported and used
  - Last workshop at CERN in October, with 140 participants from 20 countries
A Possible Solution: OAI (2)
  - Two actors:
    - Repositories (also called data providers) (JACoW)
    - Harvesters (also called service providers) (SLAC, CERN…)
  - Metadata representation:
    - Unqualified Dublin Core metadata format mandatory
    - Other metadata formats can be added
      - Qualified DC
      - MARC 21
      - …
  - Relies on the HTTP protocol
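As an illustration of the mandatory unqualified Dublin Core format, the sketch below serialises one paper as an `oai_dc` element using Python's standard library. The two namespace URIs are those of OAI-PMH 2.0 and Dublin Core 1.1. The function name `to_oai_dc` and the mapping of fields to elements (authors to `dc:creator`, abstract to `dc:description`, fulltext link to `dc:identifier`) are assumptions chosen for this example, not something the slides prescribe. The `xsi:schemaLocation` attribute that a complete record carries is left out for brevity.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def to_oai_dc(title, creators, description, language, identifier_url):
    """Return an unqualified Dublin Core record as an XML string."""
    # Register readable prefixes so the output uses oai_dc: and dc:.
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")

    def add(tag, text):
        child = ET.SubElement(root, f"{{{DC_NS}}}{tag}")
        child.text = text

    add("title", title)
    for creator in creators:  # Dublin Core elements are repeatable
        add("creator", creator)
    add("description", description)
    add("language", language)
    add("identifier", identifier_url)
    return ET.tostring(root, encoding="unicode")
```

Because every element is a flat, repeatable text field, nested information such as an author's affiliation has no natural slot here, which is one reason a repository might also offer a richer format such as MARC 21.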
A Possible Solution: OAI (3)
  - Repositories:
    - Serve the metadata on request from the harvester
    - Each document has a unique ID
    - Some “verbs” are implemented (HTTP CGI scripts)
    - The verbs serve the requested metadata
    - Can group records into data sets (conferences)
  - Harvesters:
    - Invoke the repositories' OAI verbs
    - Time-driven requests (“I want all documents published/modified from this date”)
    - Category-driven requests (“the documents should belong to the EPAC'2002 conf”)
    - Can choose between the proposed metadata formats
    - Can directly access one record knowing its OAI ID
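The harvester requests described above are plain HTTP GET URLs. The sketch below builds them with the OAI-PMH 2.0 argument names (`verb`, `metadataPrefix`, `from`, `set`, `identifier`). The base URL, the set name and the record identifier are hypothetical placeholders, and the two helper functions are illustrative rather than part of any existing harvester.

```python
from urllib.parse import urlencode

# Hypothetical repository address, for illustration only.
BASE_URL = "http://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build a ListRecords request, optionally time-driven and/or set-driven."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # "everything published/modified from this date"
    if set_spec:
        params["set"] = set_spec    # "documents belonging to this conference"
    return base_url + "?" + urlencode(params)


def get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Build a GetRecord request for one record known by its OAI identifier."""
    params = {
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(list_records_url(BASE_URL, from_date="2002-06-01", set_spec="EPAC2002"))
    print(get_record_url(BASE_URL, "oai:example.org:paper-1"))
```

Choosing a different metadata format amounts to passing another `metadata_prefix`, provided the repository advertises it.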
A Possible Solution: OAI (4)
  - Technical implementation:
    - Repository:
      - Create unique references for each record
      - Date stamp all records for creation and modification
      - Implement the 6 OAI requests (verbs)
      - Reduced work (CDS people can help)
    - Harvester:
      - Harvester programs are already available
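The repository-side tasks listed above can be sketched as a toy in-memory repository: each record gets a unique identifier and a datestamp, and a single dispatcher answers the six OAI-PMH verbs. This is a simplified illustration under stated assumptions, not the JACoW or CDS implementation: the `Repository` class and its methods are hypothetical, responses are plain Python objects rather than the XML the protocol requires, and a single datestamp stands in for both creation and modification dates.

```python
from datetime import date

# The six verbs defined by OAI-PMH.
OAI_VERBS = ("Identify", "ListMetadataFormats", "ListSets",
             "ListIdentifiers", "ListRecords", "GetRecord")


class Repository:
    def __init__(self, name, prefix):
        self.name = name
        self.prefix = prefix  # e.g. "oai:example.org" (hypothetical)
        self.records = {}     # OAI identifier -> record dict

    def add(self, local_id, set_spec, metadata, stamp):
        """Store a record under a unique identifier with a datestamp."""
        oai_id = f"{self.prefix}:{local_id}"
        self.records[oai_id] = {"id": oai_id, "set": set_spec,
                                "metadata": metadata, "datestamp": stamp}
        return oai_id

    def _select(self, from_date=None, set_spec=None):
        """Filter records by datestamp and/or set (conference)."""
        return [r for r in self.records.values()
                if (from_date is None or r["datestamp"] >= from_date)
                and (set_spec is None or r["set"] == set_spec)]

    def handle(self, verb, **args):
        """Dispatch one request to the matching verb."""
        if verb not in OAI_VERBS:
            raise ValueError(f"badVerb: {verb}")
        if verb == "Identify":
            return {"repositoryName": self.name}
        if verb == "ListMetadataFormats":
            return ["oai_dc"]
        if verb == "ListSets":
            return sorted({r["set"] for r in self.records.values()})
        if verb == "ListIdentifiers":
            return [r["id"] for r in self._select(**args)]
        if verb == "ListRecords":
            return self._select(**args)
        return self.records[args["identifier"]]  # GetRecord
```

For example, after adding a record with `repo.add("paper-1", "EPAC2002", {...}, date(2002, 6, 3))`, the call `repo.handle("ListIdentifiers", from_date=date(2002, 1, 1))` would return its identifier, which mirrors the time-driven harvesting request.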
CONCLUSION
  - CERN Library Wishes and Services
  - OAI Interface
The End
  - http://www.openarchives.org
  - http://www.dublincore.org
  - http://cds.cern.ch