Machine-actionable DMP workshop: defining requirements and priorities in South Africa Sarah Jones Digital Curation Centre, Glasgow sarah.jones@glasgow.ac.uk Twitter: @sjDCC @DMPonline Pretoria, 1 August 2018
What problems are we trying to address? By Josiah Martin public domain
Planning & administration Create, analyse, manage data Publishing & reuse • DMP on periphery • Often done at grant stage and not looked at again • Opportunities to (re)use information being missed • Disconnected & unlinked
Information flow across systems To support: • Data discovery • Capacity planning • Aggregation/integration • Policy compliance • … From Flickr by highwaysengland, CC BY 2.0
A postcard from the future: Tools and services from a perfect DMP world 47 participants from 16 countries • Funders • Developers • Librarians • Service providers • Researchers IDCC, 20 Feb 2017 www.dcc.ac.uk/events/workshops/postcard-future-tools-and-services-perfect-dmp-world Photo by Rune Askeland on 500px
Workshop aims and scope • Explore current practice on DMPs in South Africa • Understand your issues and hopes for change • Discuss opportunities for improvements • Define practical use-cases and next steps • Promote cross-stakeholder dialogue and planning
Agenda
8:00 Welcome
8:05 Workshop aims and scope
8:15 Group introductions
8:45 Introduction to Active DMPs
9:30 Coffee
10:00 Map where DMPs sit in research workflow
11:30 Group discussion
12:15 Overview of the DMPRoadmap project
13:00 Lunch
14:00 Defining use cases for machine-actionable DMPs
15:30 Questions and close
Introduce yourself Tell us who you are, where you are from and answer: • Why are you motivated/excited/required to work on data management and DMPs? • What are your pain points? • What do you hope to get out of this workshop?
Introduction to Active DMPs Sarah Jones | Digital Curation Centre | sarah.jones@glasgow.ac.uk Stephanie Simms | California Digital Library | stephanie.simms@ucop.edu Daniel Mietchen | National Institutes of Health | daniel.mietchen@nih.gov Tomasz Miksa | TU Wien | tmiksa@sba-research.org #ActiveDMPs
Lots of activity in this space! https://doi.org/10.5281/zenodo.1174283 https://activedmps.org
What is the vision? Transform static documents into active, machine-actionable DMPs that exchange data across systems to enable: • Researchers to manage, share and discover data more easily • Infrastructure providers to plan their resources • Institutions to provide effective data services • Funders to monitor data-related activities
How will this help? • Less manual completion • DMPs get updated, automatically where possible – collect information from systems – trigger actions in systems • Plans can be validated www.rd-alliance.org/system/files/documents/2018-RDA-Plenary-Berlin.pdf
Planning & administration Identifiers: Create, analyse, manage data Publishing & reuse
Requirements & priorities • CERN event on DMPs • Workshop at IDCC on a ‘perfect DMP world’ • RDA working group • GitHub user stories
User stories and classification • As a <stakeholder>, I want <goal> so that <reason> • 3 major categories (colours): stakeholders involved, project phase, subject of information conveyed https://github.com/RDA-DMP-Common/user-stories/wiki 23/10/2021 www.rd-alliance.org - @resdatall CC BY-SA 4.0
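The user story template above is regular enough to be parsed by software. The following is a minimal sketch, assuming stories are written as single sentences that follow the template exactly; the `UserStory` type, the `parse_user_story` helper and the sample sentence are hypothetical illustrations, not taken from the RDA repository.

```python
import re
from typing import NamedTuple, Optional


class UserStory(NamedTuple):
    stakeholder: str
    goal: str
    reason: str


# Pattern for "As a <stakeholder>, I want <goal> so that <reason>".
# Assumption: one story per string, with an optional trailing full stop.
_PATTERN = re.compile(
    r"^As an? (?P<stakeholder>.+?), I want (?P<goal>.+?) so that (?P<reason>.+?)\.?$",
    re.IGNORECASE,
)


def parse_user_story(text: str) -> Optional[UserStory]:
    """Return the three template slots, or None if the text does not fit."""
    match = _PATTERN.match(text.strip())
    if match is None:
        return None
    return UserStory(match["stakeholder"], match["goal"], match["reason"])


# Hypothetical example story, written for this sketch only.
story = parse_user_story(
    "As a repository manager, I want to know which datasets are planned "
    "so that I can prepare storage."
)
print(story)
```

Once stories are split into slots like this, they can be grouped by stakeholder or theme, which is the kind of classification the slide describes.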
User story visualisation https://goo.gl/znBL3F • interactive visualisation - changes on GitHub are visible immediately • shows relations between stakeholders, phases and information • Most user stories associated with planning phase • Repositories most common stakeholder • Reuse and metadata most common themes
What H2020 DMP users prioritised Marjan Grootveld, Ellen Leenarts, Sarah Jones, Emilie Hermans, & Eliane Fankhauser. (2018). OpenAIRE and FAIR Data Expert Group survey about Horizon 2020 template for Data Management Plans (Version 1.0.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1120245
Emerging papers and results Simms S, Jones S, Mietchen D, Miksa T (2017) Machine-actionable data management plans (maDMPs). Research Ideas and Outcomes 3: e13086. https://doi.org/10.3897/rio.3.e13086 Tomasz Miksa, Peter Neish, Paul Walk, & Andreas Rauber. (2018). Defining requirements for machine-actionable Data Management Plans (Version preprint). Zenodo. http://doi.org/10.5281/zenodo.1266211
maDMP priority areas • Common standards and protocols • Leveraging PIDs for automatic reporting etc. • Capacity planning (institutional & data centre) • Increasing data discovery & reuse • Supporting evaluation & monitoring • Share/publish/deposit DMPs
Where are people engaging? • Active DMPs Interest Group: spun out two WGs – DMP Common Standards – Exposing DMPs. Next plenary in Gaborone on 5-8 November 2018 https://www.rd-alliance.org • FAIR DMP Working Group: intend to aggregate ideas across communities and national contexts to distil common principles. Next meeting in Montreal on 11-12 October 2018 https://www.force11.org
DMP Common Standards - Outputs • Common data model for machine-actionable DMPs – to model information from standard DMPs – NOT a template – NOT a questionnaire – modular design – core set of elements – domain-specific extensions • Reference implementations – ready-to-use models – JSON, XML, RDF, etc. • Guidelines for adoption of the common data model – requirements for supporting systems – pilot studies
Example
• Current DMPs – model questionnaires
<administrative_data> <question>Who will be the Principal Investigator?</question> <answer>The PI will be John Smith from our university.</answer> </administrative_data>
• Machine-actionable DMPs – model information
"dc:creator": [ { "foaf:name": "John Smith", "@id": "orcid.org/0000-1111-2222-3333", "foaf:mbox": "mailto:jsmith@tuwien.ac.at", "madmp:institution": "AT-Vienna-University-of-Technology" } ],
Example (continued) • Questionnaire-style DMPs: currently available – not very useful • Machine-actionable DMP: – Reuse existing standards, e.g. Dublin Core, PREMIS, etc. – Use PIDs whenever possible, e.g. ORCID – Use controlled vocabularies – Develop own concepts and vocabularies only when needed
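To illustrate why modelling information rather than questionnaire answers matters, here is a minimal sketch that loads the creator fragment from the example slides and extracts identifiers from it. The field names come from the slide; the `creator_ids` helper is a hypothetical name, and wrapping the fragment in an outer JSON object is an assumption made so that it parses on its own.

```python
import json
from typing import List

# Free-text answer from the questionnaire-style DMP: software cannot
# reliably tell who "John Smith" is or which university is meant.
free_text_answer = "The PI will be John Smith from our university."

# The machine-actionable fragment from the slide, wrapped in braces
# (assumption) so that it is a complete JSON document.
madmp_fragment = json.loads("""
{
  "dc:creator": [
    {
      "foaf:name": "John Smith",
      "@id": "orcid.org/0000-1111-2222-3333",
      "foaf:mbox": "mailto:jsmith@tuwien.ac.at",
      "madmp:institution": "AT-Vienna-University-of-Technology"
    }
  ]
}
""")


def creator_ids(dmp: dict) -> List[str]:
    """Collect the persistent identifiers of all creators listed in a DMP."""
    return [c["@id"] for c in dmp.get("dc:creator", []) if "@id" in c]


ids = creator_ids(madmp_fragment)
print(ids)
```

The same lookup against the free-text answer would need text mining and guesswork, which is the contrast the example slides draw.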
DMP workflows and stakeholders
Next steps • 1st consultation (user stories) went broad and helped us define the scope of maDMPs – what information should a maDMP contain? – who provides and uses this information? • 2nd consultation will go deep – how do we model specific requirements? – which specific fields are needed? – which models exist?
Persistent identifiers (PIDs) Assign a DOI to the DMP of record. Use this to get award details back into a DMP and link up outputs. Leverage other PIDs to populate the DMP over time: • Researcher IDs (ORCIDs) • Funder IDs (FundRef) • Grant IDs • Research Resource IDs (RRIDs) – antibodies, organisms, cell lines http://pidapalooza.org Also enables compliance monitoring
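One practical benefit of PIDs is that many of them can be checked by software before they are used to populate a DMP. As a minimal sketch, the code below validates the check character of an ORCID iD, which uses the ISO 7064 MOD 11-2 algorithm; the function names are hypothetical, and the check only confirms that an identifier is well formed, not that it is registered.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the structure and check character of an ORCID iD.

    Accepts a bare iD or one prefixed with a host such as orcid.org/.
    """
    compact = orcid.rsplit("/", 1)[-1].replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


# The sample iD used in ORCID's own documentation passes the check.
print(is_valid_orcid("0000-0002-1825-0097"))
# The placeholder iD from the example slide is not a well-formed ORCID.
print(is_valid_orcid("orcid.org/0000-1111-2222-3333"))
```

A DMP tool could run a check like this when a researcher enters an identifier, and only then query external systems for further details.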
Institutional use cases I. Connect researchers to relevant services & support II. Gather information to forecast demand and do capacity planning III. Embed DMP in research process (domain workflows, ethics, admin systems) William Murphy. CC BY-SA 4.0
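The capacity-planning use case can be sketched as a simple aggregation over structured DMPs. This is an illustration only: the field names `start_year` and `expected_storage_gb`, the plan identifiers and the figures are all hypothetical, chosen to show the idea rather than to reflect any agreed data model.

```python
from collections import defaultdict

# Hypothetical DMP records; real plans would come from a DMP tool's export.
plans = [
    {"dmp_id": "dmp-001", "start_year": 2019, "expected_storage_gb": 500},
    {"dmp_id": "dmp-002", "start_year": 2019, "expected_storage_gb": 250},
    {"dmp_id": "dmp-003", "start_year": 2020, "expected_storage_gb": 2000},
]


def forecast_storage(dmps):
    """Sum the expected storage per project start year."""
    totals = defaultdict(int)
    for plan in dmps:
        totals[plan["start_year"]] += plan["expected_storage_gb"]
    return dict(sorted(totals.items()))


forecast = forecast_storage(plans)
print(forecast)
```

With free-text DMPs, the same forecast would require someone to read every plan and transcribe the numbers by hand.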
Repository use cases I. Repository recommender service via re3data.org II. Text mine to ping repositories when mentioned in a DMP III. Use DMP as metadata to facilitate deposit process IV. Deposit DMPs with data Ainsley Seago. CC BY 4.0
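Use case II, text mining DMPs for repository mentions, can be sketched with a simple name lookup. This is a minimal illustration under stated assumptions: the short `KNOWN_REPOSITORIES` list and the sample DMP text are hypothetical, a real service would draw names from a registry such as re3data.org, and the notification step is left out.

```python
import re

# Hypothetical shortlist; a real service would load names from a registry.
KNOWN_REPOSITORIES = ["Zenodo", "Dryad", "figshare"]


def mentioned_repositories(dmp_text, known=KNOWN_REPOSITORIES):
    """Return the known repository names that appear in the DMP text."""
    found = []
    for name in known:
        # Whole-word, case-insensitive match on the repository name.
        if re.search(rf"\b{re.escape(name)}\b", dmp_text, flags=re.IGNORECASE):
            found.append(name)
    return found


# Hypothetical free-text answer from a DMP.
sample_text = "Data will be deposited in Zenodo; code will be archived on Figshare."
mentions = mentioned_repositories(sample_text)
print(mentions)
```

Name matching of this kind is fragile, which is one argument for recording the repository as a PID in the DMP instead of as free text.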
Evaluation & monitoring Automated compliance checks • did researchers do what they said they would? Quality or validation checks • closed questions / range of defined options • training and evaluation rubrics • evaluate FAIRness of data and repository…
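Both kinds of check on this slide become straightforward once DMP content is structured. The sketch below assumes each planned and deposited dataset has a shared identifier and that licences are chosen from a closed list; the identifiers, the `ALLOWED_LICENCES` set and the function names are hypothetical.

```python
# Closed list of options for a licence question (hypothetical selection).
ALLOWED_LICENCES = {"CC0-1.0", "CC-BY-4.0", "CC-BY-SA-4.0"}


def validate_licence(value):
    """Validation check: the answer must be one of the defined options."""
    return value in ALLOWED_LICENCES


def check_compliance(planned, deposited):
    """Compliance check: compare what the DMP promised with what was deposited."""
    planned_set, deposited_set = set(planned), set(deposited)
    return {
        "fulfilled": sorted(planned_set & deposited_set),
        "missing": sorted(planned_set - deposited_set),
        "unplanned": sorted(deposited_set - planned_set),
    }


# Hypothetical dataset identifiers for one project.
planned = ["survey-data", "interview-transcripts", "analysis-code"]
deposited = ["survey-data", "analysis-code", "field-photos"]
report = check_compliance(planned, deposited)
print(report)
```

The "missing" list answers the question on the slide, did researchers do what they said they would, without anyone rereading the plan.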
Summary Think of DMPs as key elements of a networked data management ecosystem: • connected via a shared vocabulary • actionable by humans and software • versioned • public From Flickr by highwaysengland, CC BY 2.0
Where do DMPs sit in the lifecycle? 1. Map typical activities in research workflows (40 mins) – Be as creative as you can, consider all stages, go into detail on what happens and who is involved 2. Highlight tools and systems that interface with DMPs and map onto the workflow (20 mins) 3. Consider which data could be fed automatically from these systems into DMPs or vice versa (30 mins) – What information is held in each which could be useful? Be precise. What changes could trigger actions?
Group discussion • Describe / share the research workflows and where DMPs fit into them • Which systems / services are relevant in the workflow and should connect with DMP tools? • What can be automated and what will always have to be completed manually? • How can we get more value from DMPs? (e.g. by open publishing, packaging DMPs together with data, etc.)