Biodiversity Data Integration IG: Core Data Resources and FAIRification of Data
Presentation for joint meeting: IG ELIXIR Bridging Force, IG Biodiversity Data Integration, WG BioSharing Registry
RDA P11, Berlin, 2018
Wouter Addink, DiSSCo Coordination team member
From data ... to knowledge?
Moving away from siloed data
• Atlas of Living Australia • Catalogue of Life • Global Biodiversity Knowledgebase • Barcode of Life • Biodiversity Heritage Library • GBIF • iDigBio • Encyclopedia of Life • TreeBASE
Hot topics in the Biodiversity Data community
A. H. Ariño et al.: TDWG Now and Then, TDWG, Costa Rica, 7-XII-2016
UN Sustainable Development Goals
Required:
• High quality integrated data and services
• Coordinated strategy
Linking dispersed information is imperative
Example: Invasive Alien Species (IAS)
• UN Sustainable Development Goals (Target 15.8)
• Urgent challenge
• Economic costs of IAS for EU: € 20 billion / year (Kettunen et al. 2009)
Use case: Species distribution & genomics; Modelling / Prevention / Early detection
Facilities & information: Climate data; Ecological monitoring data; Genomic information; Other Research Infrastructures; Institutional collections
Services: • Linked Data • Analysis / Interpretation
RI landscape for linking biodiversity information (ENVRIplus, 2017)
Layers: DATA, MEASUREMENTS, MODELLING
RI types: RIs providing data on external factors; Integrative RIs; Species/organisms observatories; System observatories; Experiments
Alien Invasive species use case: Modelling / Prevention / Early detection; Species distribution & genomics; Biodiversity standards / Reference data; Taxonomic backbone; Institutional collections
Alignment of Projects for effective RI development - DiSSCo example
SAP: Strategic Alignment of Projects
ICEDIG € 3 M | 2018-2020 CoL+ € DiSSCo Design Study € 0.5 M | 2017-2020 € € 10 M | 2014-2017 SYNTHESYS+ € 10 M | 2019-2021 € DiSSCo Deploy € € 2 M | 2024-2025 € € DiSSCo Construct € 53 M | 2021-2024 MOBILISE € 0.5 M | 2018-2022 € DiSSCo Prepare € 20 M | 2019-2023
DiSSCo: A new European infrastructure
114 National Facilities, 21 Countries
• Largest ever formal agreement between natural science collection facilities
• Centralised governance model already in place
• Supporting network of working groups
• DiSSCo builds on top of a mature community of institutions
• Strategic collaboration already underpinned by sound governance and decision-making structures
Challenges in the Biodiversity Data domain
• Accelerate generation and linking of information into research data objects
• Ensure provenance and quality
• Provide reliable, unified, certified services and harmonised policies
• Provide services to other Research Infrastructures
• Connect publishing and use
Stakeholders in FAIRification of data
Stakeholders: e-Infrastructures; Standardisation bodies; Technical communities; Research Infrastructures
Exchanges between them:
• Specification of core cloud services | Service Level Agreements
• Recommendations - specifications | Knowledge exchange
• New community data standards
• User requirements | Systems interoperability
• Data, workflow and systems integrity
• FAIR principles
Linking Biodiversity Data & Core data resources
• Species names
Catalogue of Life Plus project: Joint development of a practical, community-based approach to rapid completion of a Global Taxonomic backbone:
- (Re-)connects taxonomic research with specimen data
- Quality control and enhanced linkages
- Contribution of taxonomic expertise through a clearinghouse
• DNA Barcoding
iBOL (International Barcode of Life project): the largest biodiversity genomics initiative ever undertaken, to create a digital identification system for life.
• Specimen identifiers
CETAF Identifiers initiative: a joint Linked Open Data (LOD) compliant identifier system developed by the CETAF Information Science and Technology Committee (ISTC), providing mechanisms for consistently referencing individual specimens
• Literature references
BHL (Biodiversity Heritage Library): Collaboratively makes biodiversity literature openly available to the world
FAIRification process adopted by GO FAIR
Steps:
• Retrieve non-FAIR data
• Analyse the retrieved data
• Define the semantic model
• Make data linkable
• Assign license
• Define metadata for the dataset
• Deploy FAIR data resource
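As a rough illustration of the steps listed on this slide, the Python sketch below walks one non-FAIR record through the middle steps: mapping local columns to a semantic model, making the record linkable, assigning a license, and attaching dataset metadata. The Darwin Core and Dublin Core term IRIs are real vocabulary terms. The column names, sample values, the `https://example.org/` base IRI, the dataset title, and the helper `fairify` are assumptions made for illustration and are not part of the GO FAIR process description. Deployment (the last step) is out of scope here.

```python
import json

# Steps 1-2: a retrieved, non-FAIR record with ad hoc column names (hypothetical).
raw_record = {"cat_no": "61643", "species": "Tarenaya spinosa", "date": "2016-01-20"}

# Step 3: the semantic model, here a mapping from local columns to Darwin Core term IRIs.
DWC = "http://rs.tdwg.org/dwc/terms/"
SEMANTIC_MODEL = {
    "cat_no": DWC + "catalogNumber",
    "species": DWC + "scientificName",
    "date": DWC + "eventDate",
}

DCTERMS = "http://purl.org/dc/terms/"
BASE_IRI = "https://example.org/specimen/"  # assumed placeholder, not a real resolver


def fairify(record, license_iri, dataset_title):
    """Hypothetical helper: apply the model, mint an IRI, add license and metadata."""
    # Step 4: make the data linkable by replacing column names with term IRIs
    # and giving the record its own identifier.
    linked = {SEMANTIC_MODEL[key]: value for key, value in record.items()}
    linked["@id"] = BASE_IRI + record["cat_no"]
    # Step 5: assign a license to the record.
    linked[DCTERMS + "license"] = license_iri
    # Step 6: define metadata for the dataset as a whole.
    metadata = {DCTERMS + "title": dataset_title, DCTERMS + "license": license_iri}
    return {"metadata": metadata, "records": [linked]}


resource = fairify(
    raw_record,
    "https://creativecommons.org/licenses/by/4.0/",
    "Example herbarium dataset",
)
print(json.dumps(resource, indent=2))
```

Keeping the semantic model as a separate mapping means the same model can be reused for every record in a dataset, and swapped out if the community settles on different terms.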
Some issues for FAIRification of Biodiversity Data
• No infrastructure yet for sensitive biodiversity data
• No standard ontologies
• Semantic Web and Linked Data technologies not widely used in community
• No common standard for metadata, and current standards are incomplete for giving attribution for the maintenance, curation, and digitization of collections (RDA / TDWG Metadata Standards WG is working on this)
The need for taxon concept identifiers
From: D. Remsen, The use and limits of scientific names in biological informatics, http://zookeys.pensoft.net/articles.php?id=6234
Data classes in the biodiversity data domain
• Sequence • Occurrence • Taxon Interaction • Gene • Collection • Publication • Taxon Name • Specimen • Taxon Concept • Trait
Relations in occurrence data
Record <Class=Occurrence>: BRA:UFPB:JPB:0000061643 (meta-model interpretation)
• Occurrence BRA:UFPB:JPB:0000061643 observedBy Observer "Soares Neto, RL"
• Occurrence fromEvent Event <Unnamed> (20 Jan 2016)
• Event atLocality Place "Campus I da UFPB" (7.1375 S, 34.84586 W)
• Place "Campus I da UFPB" includedIn Place "João Pessoa" includedIn Place "Paraíba" includedIn Place "Brasil"
• Occurrence hasEvidence Specimen 61643
• Occurrence identifiedAs TaxonConcept <Species>, hasName TaxonName "Tarenaya spinosa"
• TaxonConcept <Species> includedIn TaxonConcept <Genus>, hasName TaxonName "Tarenaya"
• TaxonConcept <Genus> includedIn TaxonConcept <Family>, hasName TaxonName "Cleomaceae"
• Specimen 61643 inCollection Collection JPB
• Collection JPB hasOwner, hasCustodian Institution UFPB
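The relations on this slide can be sketched as a small set of subject-predicate-object triples. The Python example below is one possible reading of the diagram, not the presenter's data model: the node labels (for example `Place:Campus I da UFPB`), the direction of each edge, and the helpers `objects`, `follow`, and `taxon_names` are assumptions for illustration. It shows how chained `includedIn` links let a locality resolve to its country and an identification resolve to its higher taxa.

```python
OCC = "Occurrence:BRA:UFPB:JPB:0000061643"

# Triples transcribed from one reading of the slide (edge directions assumed).
TRIPLES = [
    (OCC, "observedBy", "Observer:Soares Neto, RL"),
    (OCC, "fromEvent", "Event:20 Jan 2016"),
    ("Event:20 Jan 2016", "atLocality", "Place:Campus I da UFPB"),
    ("Place:Campus I da UFPB", "includedIn", "Place:João Pessoa"),
    ("Place:João Pessoa", "includedIn", "Place:Paraíba"),
    ("Place:Paraíba", "includedIn", "Place:Brasil"),
    (OCC, "hasEvidence", "Specimen:61643"),
    (OCC, "identifiedAs", "TaxonConcept:Species"),
    ("TaxonConcept:Species", "hasName", "TaxonName:Tarenaya spinosa"),
    ("TaxonConcept:Species", "includedIn", "TaxonConcept:Genus"),
    ("TaxonConcept:Genus", "hasName", "TaxonName:Tarenaya"),
    ("TaxonConcept:Genus", "includedIn", "TaxonConcept:Family"),
    ("TaxonConcept:Family", "hasName", "TaxonName:Cleomaceae"),
    ("Specimen:61643", "inCollection", "Collection:JPB"),
    ("Collection:JPB", "hasOwner", "Institution:UFPB"),
]


def objects(subject, predicate):
    """Return every object linked from subject by predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def follow(start, predicate):
    """Follow a single-valued predicate repeatedly and return the chain of nodes."""
    chain, seen, current = [], {start}, start
    while True:
        nxt = objects(current, predicate)
        if not nxt or nxt[0] in seen:  # stop at the end of the chain or on a cycle
            return chain
        current = nxt[0]
        seen.add(current)
        chain.append(current)


def taxon_names(occurrence):
    """Names of the identified concept and of every concept that includes it."""
    concept = objects(occurrence, "identifiedAs")[0]
    concepts = [concept] + follow(concept, "includedIn")
    return [objects(c, "hasName")[0].split(":", 1)[1] for c in concepts]


print(follow("Place:Campus I da UFPB", "includedIn"))
print(taxon_names(OCC))
```

Storing names on separate TaxonName nodes, rather than on the concepts themselves, mirrors the slide's distinction between Taxon Concept and Taxon Name.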