Building an institutional research data management infrastructure
Damaro: Data management rollout for Oxford. Sally Rumsey, The Bodleian Libraries, University of Oxford
The need for data repositories
• Research funders' policies
• Few safe archival stores for data
• Disappearing data
• Access to other researchers' data
• Journals demanding a citation to research data
• The value of publishing datasets for peer review
• Print publications
• Timely responses to FOI requests
• Data are a manifestation of the intellectual property of the institution
4 strands
• Strand 1: Research data management policies
• Strand 2: Training, support and guidance
• Strand 3: Technical development and maintenance
• Strand 4: Business plan for sustainability
• Named lead
Proposed architecture of University of Oxford's modular research data infrastructure [diagram]
• Cross-cutting: Policies; Training; Sustainability
• Stages: Data governance & training; Data creation & local (dept) management; Archival data storage and curation; Data discovery and dissemination
• Components: Data Management Planning Tool [DMPOnline project]; DataStage; ViDaaS; LabTrove; Oxford local data stores; Local (dept) DataFinder; SWORD; DataBank (DOI assigned); Software Store; Document repository; Institutional repository: ORA; Oxford DataFinder (DOI assigning; DataCite kernel minimum; CERIF compliant); Ontologies
• Internal environment / External environment (ingesting and exposing metadata): External data stores; Regional DataFinder; Colwiz
Federated data repositories [diagram]: 'Live' data repository; National data service; Subject data repository; Cloud storage; OUCS HFS back-up; Local provision; Vast files; Oxford DataBank; Trusted local storage at Oxford [Archive]; etc.
DataFinder: the keystone of Oxford's research data infrastructure
• Catalogue/Registry
• Metadata only
• Dissemination
• Discovery
• Citation
• Location
• Irrespective of format
• Compliance with funder requirements
• Explanatory
• Reporting & business intelligence
DataFinder construction
• Underlying technologies
• Metadata import and export
• Metadata agnostic
• Relationship to DataBank
Image: www.flickr.com/photos/nickobec/359440072/
Neil Jefferies, The Bodleian Libraries, 2012
Populating DataFinder
Sources of metadata
Manual entry
• Generally disliked
• Can lead to inaccuracies
• Can lead to richer metadata
Import existing
• Not much exists
• From data repositories (e.g. UKDA, Dryad)
• From central systems (e.g. RIM or DMP systems)
• From other systems (e.g. ROS)
• From machines that generate the data
Auto generated
• From DMP systems
• From DataStage
Minimum metadata set
• Mandatory
• Contextual
• Optional, including disciplinary
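A minimal sketch of how metadata from the three kinds of source above might be combined into one catalogue record. The function name, the source labels and the rule that earlier sources win are all assumptions made for illustration; the slides do not say how DataFinder resolves conflicts between sources.

```python
from typing import Dict, List, Tuple

# A source is a (label, fields) pair, e.g. ("manual", {"title": "..."}).
# The labels used here are hypothetical.
Source = Tuple[str, Dict[str, str]]


def merge_metadata(sources: List[Source]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Merge metadata fields from several sources.

    Assumption: sources are listed in priority order, so the first
    non-empty value seen for a field is kept. The second return value
    records which source each field came from.
    """
    merged: Dict[str, str] = {}
    provenance: Dict[str, str] = {}
    for label, fields in sources:
        for key, value in fields.items():
            # Skip empty values and fields already filled by an earlier source.
            if value and key not in merged:
                merged[key] = value
                provenance[key] = label
    return merged, provenance
```

Recording the origin of each field is one way to spot where manual entry added richer metadata and where it might have introduced inaccuracies.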
Recommended minimum core metadata set for Oxford [WIP] Element Auto Gen Record/digital object ID Location of dataset [Medium] UUID URL/DOI If no URL: contact details DataBank auto WebAuth/OxDMP M WebAuth/OxDMP Title Publication year M Default: digital (+ non-digital). Creator (if not depositor) Repeatable Creator affiliation (if not depositor) Repeatable (see optional) Publisher of data DataCite Note To enable indication of non-digital data. Check box + options. On/offline If depositor draw from WebAuth. (see optional) If depositor draw from WebAuth; CUD; Imply subject M Default: University of Oxford M Default: current M If an embargo period has been in effect, use the date when the embargo period ends. Access terms & conditions Default + options Data owner Access date to data Rights for metadata [Subject] Default: Department Default: current Default: CC0? ODC? FAST + options WebAuth/OxDMP For curation; ALT Name (Person or role) + Data owner contact. + Qu 'Do you own the rights for this data?' Need policy To set embargo Import where possible using available data. Encourage input. + K/w option. See Optional
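A sketch of a record carrying part of this minimum set, to show how the stated defaults could reduce manual entry. The class and field names are hypothetical; the defaults (medium digital, publisher University of Oxford, publication year current, UUID as record ID, contact details when there is no URL) follow the slide. Elements whose defaults are still marked as open questions (e.g. rights for metadata) are left out.

```python
import uuid
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class CoreRecord:
    """Hypothetical subset of the minimum core metadata set."""

    title: str
    location: Optional[str] = None          # URL or DOI of the dataset
    contact_details: Optional[str] = None   # used if there is no URL
    creators: List[str] = field(default_factory=list)  # repeatable
    # Auto-generated record identifier (UUID).
    record_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    medium: str = "digital"                 # default; non-digital also allowed
    publisher: str = "University of Oxford"
    publication_year: int = field(default_factory=lambda: date.today().year)

    def problems(self) -> List[str]:
        """Return a list of unmet requirements (empty if none)."""
        issues: List[str] = []
        if not self.title.strip():
            issues.append("title is required")
        if not self.location and not self.contact_details:
            issues.append("give a URL/DOI, or contact details if there is no URL")
        return issues
```

With these defaults a depositor would only need to supply a title and either a location or contact details; everything else can be overridden when the default does not apply.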
Damaro is not a cure-all
• A foundation
• Taking the long view
• Evolving systems
• Evolving services
• Emerging policies
• Emerging needs
• Changing practice
31st March 2013 ✔
Conclusions: Just enough…
• Services
• Metadata
• To meet immediate needs
• To be modular and flexible
• To be reactive to change
… to get the new data-focused environment started at Oxford