Dataverse for citing and sharing research data
Dataverse for citing and sharing research data. UbuntuNet-Connect 2018, 22-23 November, Zanzibar, Tanzania. Sonia Barbosa, Harvard University, USA; Obiajulu Odu, UiT The Arctic University of Norway
What is Dataverse • Dataverse is an open source web application to share, preserve, cite, explore, and analyze research data. • It facilitates making data available to others, and allows scholars to replicate others' work more easily. • Researchers, journals, data authors, publishers, data distributors, and affiliated institutions all receive academic credit and web visibility. • Developed at Harvard University’s Institute for Quantitative Social Science (IQSS)
Some Dataverse Features • Data sharing and archiving with control and recognition for data producers • Support for all file types • Persistent data citations • Persistent IDs: DOI (DataCite), HDL (Handle.net) • Data restrictions • Customized branding
Some Dataverse Features • Rich data support for certain file formats • Tabular data ingest supports the following file formats: SPSS (POR and SAV formats), Stata, R, Excel (XLSX only), CSV (comma-separated values) • Metadata extraction and subsetting • Metadata extraction for Flexible Image Transport System (FITS) data • Subsetting for social network data (GraphML)
Some Dataverse Features • Data management, standards, and archival best practices • Data versioning • General and domain‐specific metadata following metadata standards • Traffic and downloads tracking with Guestbook feature • Permanent storage of data in preservation formats • Geographically distributed preservation copies
Some Dataverse Features • OAI-PMH: harvesting metadata (DC, DDI) from other Dataverse installations and from other OAI-DC compliant repositories • Shibboleth authentication • OAuth login: ORCID, GitHub, Google
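The OAI-PMH harvesting mentioned above is driven by plain HTTP requests whose query string names a verb and a metadata format. The sketch below only builds such a request URL. The `verb` and `metadataPrefix` parameters come from the OAI-PMH protocol itself; the `/oai` path, the `oai_dc` and `oai_ddi` prefixes, and the helper name `build_oai_request` are assumptions for illustration and should be checked against the installation being harvested.

```python
from urllib.parse import urlencode


def build_oai_request(base_url, verb="ListRecords", metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH request URL (hypothetical helper, not part of Dataverse).

    Assumes the repository exposes its OAI-PMH endpoint at <base_url>/oai.
    """
    params = {"verb": verb, "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional OAI-PMH "set" argument to restrict harvesting to one set.
        params["set"] = set_spec
    return f"{base_url.rstrip('/')}/oai?{urlencode(params)}"


# Example: ask for records in DDI format (the prefix name is an assumption).
print(build_oai_request("https://demo.dataverse.org", metadata_prefix="oai_ddi"))
```

A harvester would fetch this URL, parse the returned XML, and follow the protocol's resumption tokens to page through large result sets.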
What is a Dataverse and a Dataset? Schematic diagram of a Dataverse and a Dataset in Dataverse 4.x • Dataverse: container for Datasets and/or other Dataverses • Dataset: container for data, documentation, and code
Simple Workflows
Data Citation Example
Some Organizational Models • Global: Harvard Dataverse • Consortium: The Texas Digital Library (TDL) is a consortium of higher education institutions in Texas. • National: DataverseNO - Institutes pay a fixed fee for participation in DataverseNO, to which storage costs for the data are added. Every participating institute determines its own policy. • Single institution, journals, courses, private archive
Structure at Participating Institution 1. University Library supports and trains researchers in data curation and data management plans 2. The researchers curate their data during the project and deposit it into Dataverse 3. University Library reviews the metadata and sees that a DOI is allocated and the dataset is in order 4. Controlled storage at the IT Dept. (Trusted Digital Repository) • Diagram labels: University Library supports the researcher; back-office support from the Library; persistent format in the archive
Plan and Activities • Training and workshops • Support services • Outreach strategies • Promotions • Infrastructure development and needs.
Adopted by institutions worldwide
Developers, Integration, Interoperability • The Dataverse Development Community is an active group of internal and external contributors to the Dataverse software codebase. • Dataverse APIs cover almost every piece of functionality available to users in the UI. • You can integrate an existing application with Dataverse or use an API to interact with all the data in a Dataverse installation. • Data Explorer integration tool to visualize Dataverse DDI metadata
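As one illustration of the API point above, here is a minimal sketch of how an external application might query the Dataverse Search API. It assumes the `/api/search` endpoint with `q`, `type`, and `per_page` parameters and a JSON response containing `data.items`; the helper names and the sample response (including its placeholder DOI) are hypothetical and are not taken from the slides.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, query, item_type="dataset", per_page=10):
    """Build a Search API URL (hypothetical helper; parameter names are assumed)."""
    params = {"q": query, "type": item_type, "per_page": per_page}
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


def summarize_items(payload):
    """Return (name, global_id) pairs from a decoded search response."""
    items = payload.get("data", {}).get("items", [])
    return [(item.get("name"), item.get("global_id")) for item in items]


# Made-up response in the assumed shape; the DOI is a placeholder, not a real dataset.
sample = json.loads("""
{"status": "OK",
 "data": {"total_count": 1,
          "items": [{"name": "Example dataset",
                     "type": "dataset",
                     "global_id": "doi:10.5072/FK2/EXAMPLE"}]}}
""")

print(build_search_url("https://demo.dataverse.org", "climate data"))
print(summarize_items(sample))
```

To run it against a live installation, the URL could be fetched with `urllib.request.urlopen` and the body decoded with `json.load` before being passed to `summarize_items`.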
Some Incentives and Benefits • Dataverse provides incentives for researchers to share: • Recognition & credit via data citations • Control over data & branding • Fulfill Data Management Plan requirements • Benefits for Researchers Using Dataverse • Safe and long‐term data storage in preservation format. • Allow users to download your data in any format and run many advanced statistical methods online.
Some Major Future Plans • Sensitive Data Support through DataTags • Embargo - Researchers will be able to set an availability schedule for their data. • Preserve File Hierarchy - Researchers can preserve a dataset's files' directory structure, for easy import, computation, and navigation. • Make Data Count Integration - Dataverse will integrate with Make Data Count and report standardized usage metrics. • Global Dataverse Community Consortium
Thank you / Asante! Any questions? Contact: • Obiajulu Odu <obiajulu.odu@uit.no> • Sonia Barbosa <sbarbosa@g.harvard.edu> Learn more: dataverse.org and the Forum. Try out Dataverse: demo.dataverse.org or test.dataverse.no
References • https://dataverse.org/ • https://dataverse.no/ • https://dataverse.harvard.edu/ • https://ils.unc.edu/digccurr/curategear2015-talks/crabtree.pdf • https://dataverse.org/files/dataverseorg/files/dlfdataverse_quigley.pdf • Forum: https://groups.google.com/forum/#!forum/dataverse-community