FAIR Research Objects Maryann Martone
Research objects
Research Objects are an emerging approach to the publication and exchange of scholarly information on the Web. Research Objects aim to improve reuse and reproducibility by:
● Supporting the publication of more than just PDFs, making data, code, and other resources first-class citizens of scholarship
● Recognizing that there is often a need to publish collections of these resources together as one shareable, citable resource
● Enriching these resources and collections with any and all additional information required to make research reusable and reproducible
Research objects are not just data, not just collections, but any digital resource that aims to go beyond the PDF for scholarly publishing!
http://www.researchobject.org/
Here: research objects = digital assets; the term does not refer to any specific protocol for sharing them
Why Research Objects?
• “Manifests of our intellectual output” (Bjorn Brembs)
• Many funders and journals now require publication of data and code
• Systems for citing data and software are being implemented
• Just as with narrative works, publishing data, code and other types of output requires infrastructure, conventions and evaluation systems
Publish...
• to make generally known
• to make public announcement of
• to disseminate to the public
• to produce or release for distribution; specifically: print
• to issue the work of (an author)
https://www.merriam-webster.com/dictionary/publish
The FAIR Guiding Principles for scientific data management and stewardship
High level principles to make data:
• Findable
• Accessible
• Interoperable
• Re-usable
…for humans and machines
Mark D. Wilkinson et al. The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data (2016). DOI: 10.1038/sdata.2016.18
Findable
• F1. (meta)data are assigned a globally unique and persistent identifier
• F2. data are described with rich metadata
• F3. metadata clearly and explicitly include the identifier of the data it describes
• F4. (meta)data are registered or indexed in a searchable resource
13% of data resources listed in NIF have moved location
Image credit: Wikimedia Foundation, http://wikimediafoundation.org, CC BY-SA 3.0
Metadata
• Data that provides information about other data
• Descriptive metadata describes a resource for purposes such as discovery and identification. It can include elements such as title, abstract, author, and keywords.
https://en.wikipedia.org/wiki/Metadata
PIDs
• A long-lasting reference to a document, file, web page, or other object
• URL, DOI, URI, Accession Number
• Typically, such an identifier is not only persistent but actionable: you can plug it into a web browser and be taken to the identified source.
• PMID: 27151636 (non-actionable)
  • http://identifiers.org/pubmed/27151636
• DOI: 10.1016/j.neuron.2016.04.030
  • http://dx.doi.org/10.1016/j.neuron.2016.04.030
• Persistence is a social contract
• Globally unique vs locally unique
https://en.wikipedia.org/wiki/Persistent_identifier
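The PMID and DOI examples on this slide each pair a bare identifier with a URL that a browser can resolve. A minimal Python sketch of that conversion follows; the resolver prefixes are the ones shown on the slide, while the `RESOLVERS` table and the `to_actionable_url` helper are illustrative names assumed here, not part of any registry's API.

```python
# Resolver prefixes copied from the slide; the table and helper names
# are hypothetical and only illustrate the idea of an actionable PID.
RESOLVERS = {
    "pmid": "http://identifiers.org/pubmed/",
    "doi": "http://dx.doi.org/",
}


def to_actionable_url(compact_id: str) -> str:
    """Turn 'SCHEME: local-id' into a URL that can be opened in a browser."""
    scheme, sep, local_id = compact_id.partition(":")
    if not sep:
        raise ValueError(f"no scheme prefix in {compact_id!r}")
    key = scheme.strip().lower()
    if key not in RESOLVERS:
        raise ValueError(f"no known resolver for scheme {scheme!r}")
    return RESOLVERS[key] + local_id.strip()


print(to_actionable_url("PMID: 27151636"))
print(to_actionable_url("DOI: 10.1016/j.neuron.2016.04.030"))
```

The sketch only builds the URL. Whether the link keeps working over time depends on the organizations behind the resolver, which is the sense in which persistence is a social contract.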
Data sets can be given DOIs and consistent metadata
Metadata: DataCite, Dublin Core, Schema.org
Example: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/2XP8YF
DOI: http://dx.doi.org/10.7910/DVN/2XP8YF
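To make "consistent metadata" concrete, here is a minimal Python sketch of a Schema.org `Dataset` description serialized as JSON-LD and keyed by the DOI from this slide. The `dataset_record` helper is a hypothetical name, and the title and description are placeholders, not the real metadata of the Dataverse record.

```python
import json


def dataset_record(doi: str, name: str, description: str) -> dict:
    """Build a small Schema.org Dataset description keyed by its DOI."""
    doi_url = f"http://dx.doi.org/{doi}"
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "@id": doi_url,
        "identifier": doi_url,  # the metadata carries the data's identifier (F3)
        "name": name,
        "description": description,
    }


# Placeholder title and description: assumptions for illustration only.
record = dataset_record(
    "10.7910/DVN/2XP8YF",
    name="Placeholder dataset title",
    description="Placeholder description of the dataset.",
)
print(json.dumps(record, indent=2))
```

A real record would carry many more properties (creators, license, dates); the point is only that the identifier travels inside the metadata.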
Accessible
• A1. (meta)data are retrievable by their identifier using a standardized communications protocol
  • A1.1 the protocol is open, free, and universally implementable
  • A1.2 the protocol allows for an authentication and authorization procedure, where necessary
• A2. metadata are accessible, even when the data are no longer available
• MetaGeneProfiler
• SciCrunch Registry
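HTTP is one example of an open, standardized protocol in the sense of A1. The Python sketch below builds, but does not send, a request that asks a DOI resolver for machine-readable metadata through the `Accept` header. It assumes the DOI's registration agency supports content negotiation for the CSL JSON media type; `metadata_request` is a hypothetical helper name.

```python
from urllib.request import Request

# Media type for CSL JSON; support by the DOI's registration agency
# is an assumption of this sketch.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def metadata_request(doi: str, media_type: str = CSL_JSON) -> Request:
    """Build (without sending) an HTTP GET for a DOI's metadata."""
    return Request(
        f"https://doi.org/{doi}",
        headers={"Accept": media_type},
        method="GET",
    )


req = metadata_request("10.1038/sdata.2016.18")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual retrieval; that step is left out so the sketch runs without network access.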
Dead and Living Dead Resources
• ~6% of data resources no longer available
• ~20% no longer actively maintained
Machine-actionable metadata
http://www.cellimagelibrary.org/images/40654
vs.
https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/2XP8YF
Interoperable
• I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
• I2. (meta)data use vocabularies that follow FAIR principles
• I3. (meta)data include qualified references to other (meta)data
Some examples
Database 1: Brain region = an
Database 2: Brain region = amygdaloid nucleus (UBERON_0001876)
Database 3: Brain region = amygdaloid nucleus (UBERON_0001876) (according to) Paxinos and Watson, 1998 (ISBN: 0125476191, 9780125476195)
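The three database entries above can be compared programmatically. The Python sketch below models them as records and groups them by ontology term: entries that share an UBERON identifier line up automatically, while the free-text entry stays unmapped. The `Annotation` class, its field names and `group_by_term` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Annotation:
    database: str
    label: str
    term_id: Optional[str] = None       # shared vocabulary term (I2)
    according_to: Optional[str] = None  # qualified reference (I3)


records = [
    Annotation("Database 1", "an"),
    Annotation("Database 2", "amygdaloid nucleus", "UBERON_0001876"),
    Annotation(
        "Database 3",
        "amygdaloid nucleus",
        "UBERON_0001876",
        "Paxinos and Watson, 1998 (ISBN: 0125476191)",
    ),
]


def group_by_term(annotations: List[Annotation]) -> Dict[str, List[str]]:
    """Group database names by ontology term; keep unmapped entries apart."""
    groups: Dict[str, List[str]] = {}
    for ann in annotations:
        key = ann.term_id or f"unmapped:{ann.database}"
        groups.setdefault(key, []).append(ann.database)
    return groups


groups = group_by_term(records)
for term, databases in groups.items():
    print(term, databases)
```

Only Database 3 also records which atlas the term was assigned according to, which is what a qualified reference adds on top of the shared identifier.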
Community standards
• Gladiator at IMDB
• Schema.org
• Journal Article Tag Suite (JATS)
Mons et al., 2017
Re-usable
• R1. (meta)data are richly described with a plurality of accurate and relevant attributes
  • R1.1. (meta)data are released with a clear and accessible data usage license
  • R1.2. (meta)data are associated with detailed provenance
  • R1.3. (meta)data meet domain-relevant community standards
~30% of data resources listed in NIF do not have terms of use or license information
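A check for the R1 sub-principles can be as simple as looking for the relevant fields in a metadata record. The Python sketch below does this; the field names `license`, `provenance` and `community_standard` are hypothetical, as is the example record, and the presence of a field says nothing about its quality.

```python
# Hypothetical field names mapped to the sub-principle each one supports.
REQUIRED_FOR_REUSE = {
    "license": "R1.1",
    "provenance": "R1.2",
    "community_standard": "R1.3",
}


def missing_reuse_fields(metadata: dict) -> list:
    """List the reuse-related fields that are absent or empty."""
    return [
        f"{field} ({principle})"
        for field, principle in REQUIRED_FOR_REUSE.items()
        if not metadata.get(field)
    ]


# Made-up record for illustration: license present, provenance empty.
example = {
    "title": "Hypothetical dataset",
    "license": "CC BY 4.0",
    "provenance": "",
}
missing = missing_reuse_fields(example)
print(missing)
```

Run over a registry, a check of this kind is how a figure such as the share of resources without license information could be estimated, assuming the registry exposes those fields.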
FAIR principles are
• “…characteristics that contemporary data resources, tools, vocabularies and infrastructures should exhibit to assist discovery and reuse by third-parties” (Wilkinson et al., 2016)
• Recognized by major initiatives: EC-Elixir, H2020, US National Institutes of Health and G20
• “…we support appropriate efforts to promote open science and facilitate appropriate access to publicly funded research results on findable, accessible, interoperable and reusable (FAIR) principles” (G20 Hangzhou Summit)
“The recognition that computers must be capable of accessing a data publication autonomously, unaided by their human operators, is core to the FAIR Principles. Computers are now an inseparable companion in every research endeavour” (Mons et al., 2017)
FAIR principles are not…
• A standard
• Equal to RDF, Linked Data, or the Semantic Web
• Equal to Open
• Just for life sciences
Good resource: https://www.dtls.nl/fair-data/
Mons et al., 2017
Why principles? “we don't want to re-invent the wheel” Principles of a wheel: It’s round and has to fit into a larger context
Enjoyable sailing
• Wild but navigable and connected: APIs, identifiers, norms
• FAIRports: robust standards, APIs and tools
Watch out!
• Proprietary formats, no curation, closed or restricted content, non-actionable formats
• Zombie resources; broken links; no licenses
FAIR research ahead
Research objects: high level principles to make data:
• Findable
  • Unique identifiers for people, research resources and organizations
• Accessible:
  • “Open by default; open by design”
  • Extend to people with disabilities; people outside of the academy
• Interoperable
  • Interoperable tools that allow researchers to create customized workflows
• Re-usable
• Citable:
  • Formal systems for citing data and other research objects so that usage can be tracked and credited
FAIR Ecosystem
[Diagram labels: Minimal Information Models; CDE; PID; Metadata standards; ORCID; RRID; Research resources; Protocols; People; Repositories and Registries; Ontology; DOI; Concepts; Non-digital; Translation; Citation standards; Aggregator, e.g. NIF, NIH Data Discovery Index, PubMed, Altmetrics; Data]
The digital world runs on globally unique and persistent identifiers; PIDs serve as a “key” for identifying the same entity across different contexts