
The Digital Catch: an integrative role for IAMSLIC in the worlds of metadata, harvesters and repositories
1: Species Diversity. Pauline Simpson, National Oceanography Centre, Southampton, UK
2: Aquatic Commons. Stephanie Haas, Digital Library Center, University of Florida, Gainesville, USA
Information for Responsible Fisheries: Libraries as Mediators, 10-14 October 2005, Rome, Italy
IAMSLIC Rome 2005

The Digital Catch: an integrative role for IAMSLIC in the worlds of metadata, harvesters and repositories
Part 1: Species Diversity
Pauline Simpson, National Oceanography Centre, Southampton, UK

National Oceanography Centre, Southampton
NOC is one of the world’s leading centres for research and education in marine and earth sciences, for the development of marine technology, and for the provision of large-scale infrastructure and support for the marine research community.
University of Southampton
Research-led multidisciplinary university: 20,000 students, 5,000 staff (3,000 researchers)

Open Access to Research: Southampton a key player
• 27 Jun 1994: Stevan Harnad’s ‘Subversive Proposal’, leading to the open access vision for scholarly material. In an ideal world of scholarly communication, all research should be freely available.
• The School of Electronics and Computer Science developed the first repository software, EPrints.org, to support the vision and implemented its own departmental repository.
• Work on citation analysis based on repository content, research reporting, etc.
• National Oceanography Centre, Southampton: early adopter and Project Manager; established the University of Southampton Research Repository. First in the UK to move from project to university-funded service.
A fertile open access and repository background.

Sharing at IAMSLIC Conferences
• 2002: e-Prints and the Open Archive Initiative: opportunities for libraries. Presented by Pauline Simpson in Bridging the Digital Divide, 28th Annual IAMSLIC Conference, 6-11 Oct 2002, Mazatlan, Sinaloa, Mexico
• 2003: Institutional Repositories: an opportunity for IAMSLIC. Presented by Pauline Simpson in Navigating the Shoals: evolving User Services in Aquatic and Marine Science Libraries, 29th Annual IAMSLIC Conference, 5-9 Oct 2003, Mystic, Connecticut, USA
• 2004: The Culture, Care and Content of Institutional Repositories. Presented by Pauline Simpson (contribution to Preserving our Institutional Intellectual Property: Panel Discussion) in Voyages of Discovery: parting the seas of information technology, 30th Annual IAMSLIC Conference, 6-9 Sep 2004, Hobart, Tasmania

Two Calls
1. Encourage IAMSLIC members to implement Institutional Repositories within their organizations to contribute to the provision of global Open Access to aquatic and marine science research.

Repository Development
• arXiv (from 1991 at Los Alamos, now at Cornell) for the high energy physics community (incl. Atmospheric and Oceanic Physics, Math, Computing Science and Nonlinear Science).
• Despite the success of arXiv and others, such as RePEc (Economics), Cogprints (Cognitive Psychology) and Mathematics, other subject communities have had varying success (the Chemistry Preprints Server is now finished).
• From 2000 onwards: complementary implementation of institutional repositories, fuelled by project funding (e.g. Mellon Foundation, Howard Hughes, JISC UK, Open Society Institute) and powered by the information community.

Why it should be Institutional Repositories
Institutions are logical implementers of repositories:
 – Centralise a distributed activity
 – Framework and infrastructure
 – Permanence that can sustain changes
 – Stewardship of digital assets
 – Preservation policy
 – Provide a central digital showcase for the research, teaching and scholarship of the institution
Subject or project repositories are often linked to an individual or a group and can be transitory, leaving the collection at risk.

Repositories are spreading because …
• Supplementary to traditional publication
• Do not affect current research publication processes
• Give easy and rapid access
• Give long-term access
• Increase readership and use of material, leading to more citations
• They offer advantages to institutions
• They offer advantages to research funders
• They offer new ways for information to be linked and used

Increasing number of Repositories
• 2002 = 112 (TARDis Subject Categorization Survey)
• 2005 = 466 (from Institutional Archives Registry)

http://archives.eprints.org/

Truly global movement

http://www.opendoar.org/
Main partners: Lund University, University of Nottingham

Increasing numbers: Repository choices
• Subject: arXiv, Cogprints, RePEc
• Institutional: Southampton, Glasgow, Nottingham, MBA UK, WHOI
• National: DARE (all universities in the Netherlands), Scotland
• National / Subject: ODINPubAfrica
• International: Internet Archive ‘Universal’, OAIster
• Regional: White Rose UK
• Consortia: SHERPA-LEAP (London E-prints Access Project)
• Funding Agency: NIH (PubMed), Wellcome Trust (UK PubMed), NERC
• Project: Public Knowledge Project EPrint Archive
• Conference: 11th Joint Symposium on Neural Computation, May 15 2004
• Personal: peer to peer, web pages, etc.
• Media Type: VCILT Learning Objects Repository, NTDL (Theses)
• Publisher: journal archives
• Data Repositories/Archives: NODC, BODC, DOD, JODC, BADC, etc.

Dilemma for Researcher
• Mandates from major funding agencies now require grantees to deposit research output in a ‘designated repository’ or ‘any’
• Where should the full text of their research be deposited?
• Researcher wants to enter metadata and deposit only once
• Situation at present
 – Duplicate keying of metadata into repositories of choice
 – Harvesting, but the harvester is not the choice of the depositor
 – Cannot target multiple repositories with one exercise
• Does it matter where it is deposited, since Google Scholar, Yahoo and Scopus will pick it up wherever it is deposited?

The Cavalry: Building on Repository Diversity
• Contributing to the Knowledge Cycle, encompassing experimentation, analysis, publication, research, learning
 – Joined-up research: a hub linking text and data
 – An audit trail from whatever point of access

[Diagram: repositories at the centre of research and learning workflows. From: Lyon, CNI - JISC SURF Conference, May 2005]
Diagram labels:
• Repositories: institutional, e-prints, subject, data, learning objects
• Research & e-Science workflows
• Learning & Teaching workflows
• Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
• Data analysis, transformation, mining, modelling
• Learning object creation, re-use
• Deposit / self-archiving
• Harvesting metadata
• Searching, harvesting, embedding
• Aggregator services
• Resource discovery, linking, embedding
• Presentation services: subject, media-specific, data, commercial portals
• Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
• Peer-reviewed publications: journals, conference proceedings
• Validation; Publication
• Quality assurance bodies

Knowledge Cycle: linking IRs and Data
[Diagram: the entire e-Science cycle, encompassing experimentation, analysis, publication, research, learning]
Diagram labels: Virtual Learning Environment; Digital Library; E-Scientists; Grid; E-Experimentation; Technical Reports; Preprints & Metadata; Reprints; Peer-Reviewed Journal & Conference Papers; Publisher Holdings; Institutional Archive; Local Web; Certified Experimental Results & Analyses; Data, Metadata & Ontologies; Graduate Students; Undergraduate Students

CLADDIER Project (Citation, Location And Deposition in Discipline and Institutional Repositories)
• The CLADDIER system will be a step on the road to a situation where scientists (in this case, environmental scientists) will be able to move seamlessly from information discovery (location), through acquisition, to deposition of new material, with all the digital objects correctly identified and cited. The lessons learned will apply to the relationships between other discipline-based repositories and institutional repositories.

• Persistent identifiers
• Data citations
• Automated linking

A plus for researchers
• One outcome of the CLADDIER Project
 – Present OAI-PMH harvesting = ‘pull’
 – CLADDIER outcome = ‘push’
 – Enable the researcher to deposit in one repository and choose to upload (push) the metadata to another repository of choice
• Institutional Repository to Subject Repository/s of choice: redundancy of records, does it matter?
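The pull/push distinction can be made concrete with a small sketch. The pull side uses the standard OAI-PMH `ListRecords` request arguments (`verb`, `metadataPrefix`, `set`, `from`); the push side is purely hypothetical, since the slides do not describe how CLADDIER would implement it. The function names and the example base URL are illustrative assumptions, not part of either project.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL: the harvester 'pulls' from the repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec          # optional selective harvesting by set
    if from_date:
        params["from"] = from_date        # optional lower date bound, e.g. "2005-01-01"
    return base_url + "?" + urlencode(params)


def plan_push(record, chosen_repositories):
    """Hypothetical 'push': pair one deposited record with each repository the depositor chose.

    No real transfer protocol is implied; this only shows that the depositor,
    not the harvester, selects the targets.
    """
    return [(repository, record) for repository in chosen_repositories]
```

With pull, the harvesting service decides what to collect and when; with push, a single deposit fans out to the targets the researcher names, which is the "deposit only once" behaviour asked for earlier in the talk.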

Two Calls
2. IAMSLIC to host a repository for those who do not have the support to set up their own, and a harvester service for aquatic and marine science providing discovery and location through a single search interface.

IAMSLIC Aquatic and Marine Science Repository and Harvester: early concept
[Diagram]
Diagram labels: Depositor; Marine Science Institutional e-Print repositories; Regional e-Print Repository (ODINPubAfrica); arXiv (Atmos & Oceanic Physics); OAI-PMH; IAMSLIC Marine Science e-Print Service; Harvester (General); User Search

IAMSLIC as service provider
[Diagram]
Diagram labels: Author; Institutional Repositories; Disciplinary Repositories (incl. IAMSLIC); Peer-to-peer; Repositories of every flavour; Open repositories; OAI-PMH; Interoperability; Standards; Content; Aquatic Commons Repository (value-added services); Linking (Z39.50 Library); Multimedia; Reader

The Digital Catch: an integrative role for IAMSLIC in the worlds of metadata, harvesters and repositories
Part 2: Aquatic Commons
Stephanie C. Haas, Digital Library Center, University of Florida Libraries, Gainesville

Aquatic Commons is a model for digital resource sharing between stakeholders in the marine/aquatic information world. Its integrative architecture accommodates researchers and research institutions at all technological levels. The model includes repositories, harvesting functions, searchable database creation, and integration with IAMSLIC’s Z39.50 distributed library and the ASFA database.

Special thanks are extended to the Florida Center for Library Automation (FCLA) for providing technical expertise, computer hardware/software, and the programming to develop a proof-of-concept model.

AQUATIC COMMONS: the purpose
Aquatic Commons is being developed to:
1) Create a central metadata and digital document reservoir related to marine and aquatic science information worldwide.
2) Support IAMSLIC’s long-term goal of helping researchers and the public freely access needed information.
3) Integrate the efforts of the total community by harvesting metadata where available and by creating repository and harvesting opportunities where needed.

Identified stakeholders in the development of the Aquatic Commons
1) Researchers and research institutions in the marine and aquatic sciences
2) UN, International, and National ASFA partners
3) CSA
4) FAO ASFA Secretariat
5) Other marine research agencies such as IOC, NOAA, etc.
6) IAMSLIC and its affiliated regional groups
7) Florida Center for Library Automation (FCLA)

Aquatic Commons architecture consists of an integrated Open Archives Initiative (OAI) system that includes: a harvester, an OAI provider, a search interface, a database, and a Zebra Z39.50 server. At production level, the system will be based on Open Access software and will be scalable to accommodate new repositories coming online.

Overview of Components
Aquatic Commons is designed as an OAI integrated system that will functionally:
• Harvest and create a searchable database of OAI-compliant metadata from extant repositories or OAI static repositories, including the Aquatic repository developed as part of this model, and in turn
• Serve OAI-compliant metadata to other services.
It will also create:
• An Aquatic eprint Repository to house digital works and metadata created by researchers or institutions that don’t have stable IT support.

and
• A Zebra Z39.50 server that will interface with the IAMSLIC Z39.50 distributed library.
OPTIONAL FUNCTIONALITY:
• Digital archiving at the FCLA Digital Archives of publications submitted to the Aquatic e-print Repository server.
• Metadata with links to documents can be harvested from the Aquatic Repository by CSA for inclusion in ASFA.
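The provider role in the overview (serving OAI-compliant metadata to other services) comes down to emitting each record as unqualified Dublin Core in the `oai_dc` container. The following is a minimal sketch of that serialization step only, using Python's standard library; the function name `to_oai_dc` and the dictionary layout are assumptions for illustration, not FCLA's implementation, and a real provider would also wrap the result in the OAI-PMH response envelope.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Register prefixes so the output uses the conventional oai_dc: and dc: names.
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def to_oai_dc(fields):
    """Serialize {element_name: [values]} as an oai_dc:dc XML string.

    `fields` is a hypothetical in-memory form of one record, e.g.
    {"title": ["..."], "creator": ["...", "..."]}.
    """
    root = ET.Element("{%s}dc" % OAI_DC_NS)
    for name, values in fields.items():
        for value in values:
            # Dublin Core elements are repeatable, so each value gets its own element.
            ET.SubElement(root, "{%s}%s" % (DC_NS, name)).text = value
    return ET.tostring(root, encoding="unicode")
```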

Aquatic Commons architecture responsibilities:
Harvest and create a searchable database of subject-relevant OAI-compliant metadata, including a sample from the Aquatic eprint Repository
Currently in test
• FCLA has harvested and made searchable metadata from six collections, including the Aquatic eprint Repository developed as part of this model.
• FCLA has implemented a functional OAI static repository gateway to harvest metadata from OAI static repositories.
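A harvest-then-search cycle of the kind described here can be sketched in a few lines. This is an illustrative outline, not FCLA's harvester: it issues OAI-PMH `ListRecords` requests, follows `resumptionToken` values until the repository reports no more pages, and keeps identifier and title for a simple title search. The `fetch` argument is a caller-supplied function (an assumption made so the sketch stays independent of any network library), and `harvest` and `search` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def harvest(fetch, base_url):
    """Collect identifier/title pairs from an OAI-PMH repository.

    `fetch(url)` must return the response body as an XML string.
    """
    records = []
    url = base_url + "?verb=ListRecords&metadataPrefix=oai_dc"
    while url:
        root = ET.fromstring(fetch(url))
        for rec in root.iter(OAI + "record"):
            identifier = rec.findtext(OAI + "header/" + OAI + "identifier")
            title = rec.findtext(".//" + DC + "title") or ""
            records.append({"identifier": identifier, "title": title.strip()})
        # An absent or empty resumptionToken means the list is complete.
        token = (root.findtext(".//" + OAI + "resumptionToken") or "").strip()
        url = (base_url + "?verb=ListRecords&resumptionToken=" + quote(token)) if token else None
    return records


def search(records, term):
    """Case-insensitive substring search over harvested titles."""
    term = term.lower()
    return [r for r in records if term in r["title"].lower()]
```

A production service would store the harvested records in a database and index more fields than the title; the in-memory list here only stands in for that step.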

FCLA is harvesting the following sites:
• Aquatic eprints Repository
• Baltic Marine Environment Bibliography 1970
• W. M. Keck Laboratory of Hydraulics and Water Resources Technical Reports
• Oregon Institute of Marine Biology
• Woods Hole Oceanographic Institution
• ODINPubAfrica

Database

OAI Static Repository
OAI static repository records are records wrapped in XML and served as a Web page at a persistent URL. They contain header information and metadata information.
Currently in test
FCLA is using a static gateway to broker records from an XML web page created at the University of Florida.
Sample of one record:

    <oai:record>
      <oai:header>
        <oai:identifier>oai:www.uflib.ufl.edu/digital/temporary/IAMSLIC.xml/00002</oai:identifier>
        <oai:datestamp>2005-06-03</oai:datestamp>
      </oai:header>
      <oai:metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
          <dc:creator>Brown, Mark T.</dc:creator>
          <dc:title>Successional development of forested wetlands on reclaimed phosphate mined lands in Florida: Final report volume I</dc:title>
          <dc:subject>Phosphate mines and mining; Florida; wetlands</dc:subject>
          <dc:description>Prepared for Florida Institute of Phosphate Research, 1855 West Main Street, Bartow, Florida 33830 USA, Contact Manager: Steven G. Richardson, FIPR Project Numbers: 95-03117R and 98-03-131.</dc:description>
          <dc:description>Howard T. Odum Center for Wetlands.</dc:description>
          <dc:date>2002</dc:date>
          <dc:identifier>http://purl.fcla.edu/fcla/tc/feol/UF00015102.pdf</dc:identifier>
        </oai_dc:dc>
      </oai:metadata>
    </oai:record>
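To show how a gateway or harvester might read such a record, here is a short sketch using Python's standard `xml.etree.ElementTree`. It is an illustration rather than the FCLA gateway code. The sample is abbreviated from the record in the slides, and the `xmlns:oai` declaration is added as an assumption: in a full static repository file the `oai` prefix is declared on an enclosing element, which the slide excerpt does not show. The name `parse_record` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Abbreviated copy of the sample record; xmlns:oai is added here (assumption)
# because the excerpt omits the enclosing element that would declare it.
SAMPLE = """<oai:record xmlns:oai="http://www.openarchives.org/OAI/2.0/">
  <oai:header>
    <oai:identifier>oai:www.uflib.ufl.edu/digital/temporary/IAMSLIC.xml/00002</oai:identifier>
    <oai:datestamp>2005-06-03</oai:datestamp>
  </oai:header>
  <oai:metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:creator>Brown, Mark T.</dc:creator>
      <dc:description>Howard T. Odum Center for Wetlands.</dc:description>
      <dc:date>2002</dc:date>
      <dc:identifier>http://purl.fcla.edu/fcla/tc/feol/UF00015102.pdf</dc:identifier>
    </oai_dc:dc>
  </oai:metadata>
</oai:record>"""


def parse_record(xml_text):
    """Return the header fields and the repeatable Dublin Core fields of one record."""
    record = ET.fromstring(xml_text)
    dc_prefix = "{" + NS["dc"] + "}"
    fields = {}
    for element in record.find("oai:metadata", NS).iter():
        if element.tag.startswith(dc_prefix):
            name = element.tag[len(dc_prefix):]
            fields.setdefault(name, []).append((element.text or "").strip())
    return {
        "identifier": record.findtext("oai:header/oai:identifier", namespaces=NS),
        "datestamp": record.findtext("oai:header/oai:datestamp", namespaces=NS),
        "dc": fields,
    }
```

Dublin Core elements may repeat (the full record carries two `dc:description` elements), so each field is kept as a list rather than a single value.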

Aquatic eprint Repository
Currently in test
Using open source eprint software from the University of Southampton, FCLA has set up a testbed in which researchers and/or institutions without access to IT support can create metadata and submit documents.

Zebra Z39.50 server interfaces with IAMSLIC’s Z39.50 Library Gateway
FCLA and IAMSLIC inputs: Steve Watkins will be working with FCLA to develop this functionality. Searches initiated in the IAMSLIC Library Gateway will search the Aquatic Commons database as well.

SET UP COST ESTIMATES (year 1)

Hardware / network
  Server, dual cpu, 4 GB memory, 156 GB internal disk    $  5,000
  Tape cartridge for backup                              $    200
Software
  Red Hat Linux (OS)                                     $     50
  Tivoli (backup server)                                 $     50
  Tripwire (security)                                    $    300
Staff
  Development and setup (320 hours)                      $  4,800
Total one-time costs                                     $ 10,400

ANNUAL ONGOING COST ESTIMATES (starting year 2)

Hardware / network
  Server maintenance                                     $    500
  Network cost                                           $     86
Software
  Red Hat Linux (OS)                                     $     50
  Tivoli (backup server)                                 $     50
  Tripwire (security)                                    $    165
Staff
  Ongoing maintenance and support (20 hrs/mo)            $  3,600
Total annual ongoing costs                               $  4,451
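The two estimates can be checked and combined into a multi-year figure with a few lines of arithmetic. The line items below are taken from the slides; the shortened item names and the `cumulative_cost` helper are illustrative, and the projection assumes, as the slide headings suggest, that year 1 carries only the one-time costs and that ongoing costs begin in year 2 and stay flat.

```python
# One-time set-up costs (year 1), in US dollars, from the slide.
SETUP = {
    "Server": 5000,
    "Tape cartridge for backup": 200,
    "Red Hat Linux (OS)": 50,
    "Tivoli (backup server)": 50,
    "Tripwire (security)": 300,
    "Development and setup (320 hours)": 4800,
}

# Annual ongoing costs (from year 2), in US dollars, from the slide.
ANNUAL = {
    "Server maintenance": 500,
    "Network cost": 86,
    "Red Hat Linux (OS)": 50,
    "Tivoli (backup server)": 50,
    "Tripwire (security)": 165,
    "Ongoing maintenance and support (20 hrs/mo)": 3600,
}


def cumulative_cost(years):
    """Total cost after `years` years, assuming flat ongoing costs from year 2."""
    if years < 1:
        return 0
    return sum(SETUP.values()) + (years - 1) * sum(ANNUAL.values())


# Implied staff rates: both line items work out to the same hourly figure.
SETUP_HOURLY_RATE = 4800 / 320            # dollars per hour for development and setup
ONGOING_HOURLY_RATE = 3600 / (20 * 12)    # dollars per hour for 20 hrs/month over 12 months
```

Summing the items reproduces the slide totals ($10,400 and $4,451), and both staff lines imply a rate of $15 per hour.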

OPTIONS: DIGITAL ARCHIVING at FCLA of publications submitted to the Aquatic eprint Repository server.
FCLA has created one of the first “true” digital archives in the U.S. The FCLA Digital Archive may be found at http://www.fcla.edu/digitalArchive/index.htm

Third party service
The University of Florida Libraries has the opportunity to develop collaborative agreements that extend digital archiving services to third parties. Formal agreements would be negotiated should this service be desired.

OPTION: If FAO and CSA become collaborators on the Aquatic Commons, metadata from the Aquatic e-print Repository could be harvested for inclusion in ASFA.
CSA inputs: Enhancing metadata to meet ASFA record standards

If we accept the premise that most research papers are composed in an electronic environment, then they can be shared. Even ancient formats such as WordPerfect can be converted and served as PDF files. If Internet access is unstable, files can be submitted on disk, CD, or DVD. Capture at creation is the most efficient means of sharing knowledge and assuring archival fidelity.

There are many details to be worked out, including:
1) Assurances that digital content is available for open access without copyright infringement,
2) Defining relationships between participating organizations,
3) The technical aspects of the Aquatic eprint Repository, including the handling of multi-language records and digital documents, and
4) Formal agreements with FCLA for its technical support of this initiative.

In collaboration with others, IAMSLIC has the opportunity to create Aquatic Commons as an essential digital resource for those involved in all aspects of research and resource management.