Introduction to Open Access and Institutional Repositories

  • Slides: 49
Introduction to Open Access and Institutional Repositories
Hussein Suleman
hussein@cs.uct.ac.za
University of Cape Town
October 2004

Discovering Research 1/3

Discovering Research 2/3

Discovering Research 3/3

Overview
  • Open Access
  • Institutional Repositories
  • Self Archiving
  • Copyright and publishing
  • Software issues
  • The UCT-CS archive

What is Open Access
  • Open Access implies that any member of the public can get unhindered access to digital versions of publications.
  • Key Aspects:
      - No (Low?) cost.
      - No access restrictions.
      - High quality of publications.

Traditional Open Access
  • Documents on an academic’s Web page.
  • Problems:
      - Persistence – will the documents always be there?
      - Authority – can we trust the authenticity of the website?
      - Standards – are the formats and metadata the same across all websites?
      - Discovery – how do we find these documents? Google?

Early Digital Libraries
  • Digital libraries are repositories/archives that aim to address all previous OA problems using software for management of documents.
  • In the 90’s subject archives were popular, e.g., arXiv, RePEc, NCSTRL.
  • Problems:
      - Sustainability – repositories run by organisations with limited funding.
      - Third party effect – researchers had to be led to archives to submit/discover.
      - Should the archive be centralised or distributed?

Open Archives Initiative
  • The Open Archives Initiative (OAI) was formed to solve problems of archive interoperability – i.e., how they can work together.
  • In 1999-2002, OAI developed the Protocol for Metadata Harvesting (PMH).
  • OAI-PMH is a low-cost interoperability protocol.
      - Any archive that supports this can exchange information with other archives.
      - This makes it possible for archives to be connected together so collections of documents are no longer isolated.
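
One way to see why the protocol is low-cost: an OAI-PMH request is an ordinary HTTP GET whose query string carries a `verb` and a few arguments. The sketch below only builds such a URL with the Python standard library. The helper name `build_oai_request` and the base URL `archive.example.org` are illustrative assumptions, not part of the talk.

```python
from urllib.parse import urlencode

# The six request types defined by OAI-PMH v2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def build_oai_request(base_url, verb, **arguments):
    """Return the GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    # The verb goes first, followed by the arguments in the order given.
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Placeholder base URL, not a real archive.
url = build_oai_request(
    "http://archive.example.org/oai", "ListRecords", metadataPrefix="oai_dc"
)
print(url)
```

A harvester would fetch this URL and parse the XML that comes back; no special client software is needed on either side.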

Budapest Open Access Initiative 1/2

Budapest Open Access Initiative 2/2
  • Towards the end of 2001, the BOAI was launched to assist with making academic literature freely available.
  • BOAI is an advocacy and support initiative.
  • Linked to the Open Society Institute for funding.
  • Open Access usually refers to:
      - Open Access Journals
      - Institutional Repositories
  • http://www.soros.org/openaccess/

BioMed Central
  • http://www.biomedcentral.com/

Open Access Journals
  • Open Access Journals do not charge users for access to documents.
  • Funded by:
      - Donations
      - Page costs paid by authors
  • Peer review procedure is exactly the same as other journals.
  • There is no profit motive – OAJs are usually run by scholarly societies/groups.

Institutional Repositories
  • Institutional Repositories (IR) are digital libraries run by an educational/research institution to archive documents owned/produced locally.
  • Types of institutional repositories:
      - Departmental
      - Special Collections
      - Centralised
      - Departmental Federated
      - University Federated

The Departmental Archive

Special Collections

Centralised IR

Federated IR
  • Federation across Departmental IRs
      - Connected through the OAI-PMH so discovery can take place in one place (“campus portal”) but each department runs its own archive.
  • Federation across Universities
      - Connected through OAI-PMH so discovery can take place in a central location, e.g., Google.
      - Example: Networked Digital Library of Theses and Dissertations (NDLTD)

Alphabet Soup
  • OAI – Open Archives Initiative
      - Devised OAI-PMH, standard for connecting archives
  • BOAI – Budapest Open Access Initiative
      - Supports Open Access
  • OA – Open Access or Open Archive
      - Open Access: free unhindered access to research
      - Open Archive: supports OAI-PMH
  • OAJ – Open Access Journal
  • IR – Institutional Repository

Self-archiving
  • Self Archiving means taking control of, and responsibility for, the preservation of and access to your research publications.
      - Traditionally, by adding documents to your website.
      - Recently, by adding documents to an IR at your institution.
  • Self Archiving is the best way to fill IRs
      - Authors understand their documents best.
      - Cost is lower than centralised archiving.

Why Self-Archive?
  • Take ownership of your research!
  • Easier access for collaborators (“reprints” are dead).
  • National/regional/institutional rules and laws.
  • Greater visibility to research.
  • Can provide access even if university does not subscribe to journals.
  • Complete view of individual research output.

Why Self-Archive in an IR?
  • Takes away burden to maintain website.
  • Professional support from library.
  • Much better reliability
      - backup, migration, quality of service, etc.
  • Provides consistent and standardised formats/metadata.
  • Institutions may require it.
  • Complete view of institutional research output.

Issues: Publication and Pre-Prints
  • If we put pre-publication documents into an IR, does this affect publication?
  • Generally, NO. Why?
      - Computer Scientists and Physicists have done this for decades with “technical reports”.
      - The version in the archive is (often substantially) different from the reviewed and published version.
      - Theses and dissertations are not considered prepublication by publishers.

Issues: Copyright and Post-Prints
  • If we deposit post-publication documents into an IR, doesn’t this violate copyright?
  • Generally, NO. Why?
      - Most society publishers will allow archiving on a website or IR, e.g., ACM.
      - Most commercial publishers allow archiving on a website or IR after some time (typically 12-24 months).
      - Newer commercial publisher agreements make greater allowance for IRs.
      - You can always negotiate with a publisher!

Issues: Publishers and Government
  • Commercial publishers require copyright transfer - Open Access publishers do not.
  • Some governments are mandating OA for research:
      - UK has passed law already.
      - US is considering a law.
      - Many governments have laws regarding theses.
  • Moral: Commercial publishers have to adapt – exclusive copyright transfer will not work if governments do not allow it!

Software Issues 1/2
  • What do you need to contribute to or access an institutional repository?
  • Web Browser
  • To contribute: maybe a way to create PDFs
      - Adobe Acrobat.
      - Open Source and Freeware software available!

Software Issues 2/2
  • Free software available to create an IR – Open Society Institute maintains a list:
      - EPrints
      - DSpace
      - Etc.
  • All packages support OAI-PMH – they can be connected to other systems.

EPrints

DSpace

The UCT-CS Repository
  ✓ Author self-submission
  ✓ Checking of submissions
  ✓ Archive-everything!
  ✓ UCT-CS-specific metadata and classification systems
  ✓ Hierarchical browsing
  ✓ Simple and fielded searching
  ✓ OAI-PMH compliance

Open Access
  • If it can’t be found in Google …
  • 1299 hits directly from Google in June 2004.
  • Example:
      - http://www.google.com/search?q=questionnaire+system+UML
      - Kritzinger, Pieter, Marshini Chetty, Jesse Landman, Michael Marconi and Oksana Ryndina (2003) ChattaBox: A Case Study in Using UML and SDL for Engineering Concurrent Communicating Software Systems. In Proceedings Southern African Telecommunications Networks and Applications Conference, George, South Africa.

Why we have a repository
  • It was faster than simply waiting!
  • CS departments internationally archive technical reports (NCSTRL).
  • Research websites don’t last long (enough).
  • UCT doesn’t have an ETD project yet.
  • We need to improve ACCESS to our work.
  • We need to preserve our research output.
  • Bureaucracy (UCT, NRF, DoE, etc.) requires tracking publications.
  • We (think we) know what we are doing.

What we archive
  • Books and Book Chapters
  • Conference Papers and Posters
  • Journals (online and paginated)
  • Newspaper and Magazine Articles
  • Preprints
  • Presentation Slides
  • Conference Proceedings
  • Departmental Technical Reports
  • Electronic Theses and Dissertations
  • Other Stuff …

Infrastructure Requirements
  • Software: EPrints v2.2.1
      - plus a few changes here and there.
  • Server:
      - Tacked onto an existing machine!
      - 2 GHz Pentium/512 MB/40 GB
  • Operating System:
      - FreeBSD 5.0
  • Web server:
      - Apache v1.3.7
  • Administrator: shared with other systems …

Community Building
  • Filling the archive:
      - Get official support.
      - Twist arms of staff.
      - Fill archive with own publications to make others look bad.
      - Twist arms of staff even harder.
      - Get (student) researchers to twist student arms.
      - “The domino effect”.

Copyright

Metadata/Citation Rendering

Academic Overview of Research

NRF / DoE Credit
  • We already have a departmental listing of all research output.
  • Where copyright does not allow, we include just a citation – no files – for completeness.

Interoperability
  • Our archive is compliant with Open Archives Initiative’s Protocol for Metadata Harvesting (OAI-PMH) v2.0.
  • Metadata can be freely harvested by any service provider.
  • baseURL: http://pubs.cs.uct.ac.za/perl/oai2
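
To make "freely harvested" concrete, here is a minimal sketch of what a service provider might do with a `ListRecords` response in Dublin Core. The XML below is a hand-written, trimmed sample (an assumption, not output captured from the UCT archive), and `extract_titles` is a hypothetical helper. The namespace URIs are the ones defined by OAI-PMH v2.0 and Dublin Core.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH v2.0 responses carrying unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample response, trimmed to the parts this sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Technical Report</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_titles(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


print(extract_titles(SAMPLE_RESPONSE))
```

A real harvester would also follow `resumptionToken` elements to page through large result sets; that is omitted here to keep the sketch short.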

Communities and Metadata
  • Participate in OAI:
      - Metadata can be in Dublin Core.
  • Participate in NDLTD:
      - Metadata can be in ETDMS.
      - Set for theses and dissertations only.
  • Participate in NCSTRL:
      - Metadata can be in RFC 1807.
      - Set for technical reports only.
      - OAI-PMH Request:
          http://pubs.cs.uct.ac.za/perl/oai2?verb=ListRecords&metadataPrefix=oai_rfc1807&set=747970653D746563687265706F7274
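
The request above can be taken apart with the Python standard library. The `set` value looks like hex-encoded ASCII; reading it that way is an assumption based on decoding it, not something the slides state. The helper name `decode_set_spec` is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# The request from the slide, with the extraction spaces removed.
REQUEST = (
    "http://pubs.cs.uct.ac.za/perl/oai2?verb=ListRecords"
    "&metadataPrefix=oai_rfc1807&set=747970653D746563687265706F7274"
)


def decode_set_spec(hex_spec):
    """Decode a set name, assuming it is hex-encoded ASCII."""
    return bytes.fromhex(hex_spec).decode("ascii")


# parse_qs returns a list per key; each key appears once here.
params = {key: values[0] for key, values in parse_qs(urlsplit(REQUEST).query).items()}

print(params["verb"])                  # ListRecords
print(params["metadataPrefix"])        # oai_rfc1807
print(decode_set_spec(params["set"]))  # type=techreport
```

Decoded this way, the set selects records whose type is a technical report, which matches the "Set for technical reports only" bullet.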

Statistics 1/2
  • Unique IP address accesses

Statistics 2/2
  • Unique resource file (.pdf) accesses

Migration
  • http://pubs.cs.uct.ac.za is the “public view”
  • http://pubs.cs.uct.ac.za:1081 is the actual server.
  • Apache rewriting rules are used to proxy to the actual server.
  • Advantages:
      - Migration is trivial - we can move the server and nobody will know.
      - All resources have the most generic URL possible.
      - The repository can be co-located with other projects.
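
The mapping that such a proxy rule performs can be sketched in a few lines. This is a Python illustration of the idea only, not the Apache configuration used by the archive (the slides do not show it). The function name `to_backend` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

PUBLIC_HOST = "pubs.cs.uct.ac.za"          # the "public view" from the slide
BACKEND_NETLOC = "pubs.cs.uct.ac.za:1081"  # the actual server from the slide


def to_backend(public_url):
    """Map a public-view URL to the actual server, keeping path and query."""
    parts = urlsplit(public_url)
    if parts.hostname != PUBLIC_HOST or parts.port not in (None, 80):
        raise ValueError(f"not a public-view URL: {public_url}")
    # Only the host:port changes; everything after it is passed through.
    return urlunsplit(parts._replace(netloc=BACKEND_NETLOC))


print(to_backend("http://pubs.cs.uct.ac.za/perl/oai2?verb=Identify"))
```

Because only `BACKEND_NETLOC` names the real machine, moving the repository means changing that one value while every published URL stays the same.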

Research
  • Import metadata/files into DSpace
      - Student assignment to migrate metadata/content.
      - Based completely on OAI-PMH interface.
      - All 15 groups replicated basic EPrints functionality in DSpace with same data set.
  • Higher-level services to enhance basic services provided by EPrints.
      - Ongoing work into component-based digital libraries…

Future Plans
  • Move to a dedicated server
      - when the university decides it is important enough.
  • Link into an institutional repository system
      - when it is set up.
  • Until such time, continue self-archiving in the belief that institutional repositories can be built “bottom-up”.
  • Do it on a small scale, reap the benefits and demonstrate the potential!

Links
  • Open Archives Initiative
      - http://www.openarchives.org/
  • Budapest Open Access Initiative
      - http://www.soros.org/openaccess/
  • BioMed Central
      - http://www.biomedcentral.com/
  • UCT CS Research Archive
      - http://pubs.cs.uct.ac.za
  • EPrints
      - http://www.eprints.org/
  • DSpace
      - http://www.dspace.org/

That’s all Folks!
direct all comments to: hussein@cs.uct.ac.za
