Wheat Data Interoperability
Esther DZALE YEUMO KABORE, Richard FULSS
Wheat Data Interoperability
§ Working Group Focus:
  § The WG aims to provide a common framework for describing, representing, linking, and publishing Wheat data with respect to open standards.
  § The WG will focus first on the following data types: SNPs, genomic annotations, phenotypes, genetic maps, physical maps, germplasm, and expression data.
Wheat Data Interoperability
§ Working Group Deliverables:
  § A report on the existing resources (vocabularies, ontologies, data formats, metadata standards). (Started)
  § A cookbook intended for the Wheat data managers community: guidelines on metadata, vocabularies, and ontologies, plus a decision tree based on data and metadata description recommendations and file format recommendations. (Started)
  § A library of linked vocabularies and ontologies that respects the Linked Data standards.
  § A prototype. The goal of the prototype is to propose a ready-to-use and streamlined framework for:
    § integration of heterogeneous Wheat data,
    § publishing Wheat Linked Data that facilitates the reuse of the mashed-up data by programs and humans.
Survey on Wheat standards
§ Conducted from April 7 to June 3, 2014
§ Replies from 196 individuals in at least 31 different countries
§ More than 50% of the respondents have not yet established data management guidelines.
Survey results – The respondents (1)
Survey results – The respondents (2)
Do you use data produced in other laboratories/institutions?
• sometimes: 76 (42%)
• very often: 55 (31%)
• no answer: 43 (24%)
• never: 5 (3%)
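The pairing of counts and percentages in the chart labels can be checked with a short calculation. The sketch below is in Python and rests on one assumption: the percentages are shares of the four categories shown on the chart (179 answers in total), not of all 196 survey respondents. The helper name `shares` is hypothetical.

```python
# Counts as read from the pie-chart labels on the slide.
counts = {"sometimes": 76, "very often": 55, "no answer": 43, "never": 5}


def shares(counts):
    """Return each category's share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


percentages = shares(counts)
for label, pct in percentages.items():
    print(f"{label}: {counts[label]} responses, {pct}%")
```

Under that assumption the rounded shares match the four percentages printed on the chart (42%, 31%, 24%, 3%).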
Survey results – Data storage practices
114 of the 196 respondents currently store their data on local drives; 84 are willing to use shared databases and repositories.
[Figure: bar chart "Data storage" comparing current ("Currently") and desired ("Wanted") practices for files in a local drive, files in a shared drive, local databases, shared databases, and other.]
Survey results – Important data types
Phenotypes, SNPs, genomic annotations, germplasm, genetic maps, and physical maps were listed as the most important data types over the next five years. Only 24 of the 196 respondents mentioned other data types.
Survey results – data formats (1)
Survey results – Ontologies
Why not?
• Lack of knowledge (don’t know, too difficult, etc.)
• Lack of trust (lots of talk about their development, but little/no implementation, no agreement, standards, incomplete)
• Lack of interest (not useful)
• In progress
• No need/required/relevant
• Too complicated
• Not available
Other ontologies mentioned are:
• ECPGR
• Ontologies to develop conceptual ABM
• PATO, XEML
• Plant Environmental conditions ontology
• Plant pathogens: http://www.pathoplant.de/; PLEXdb
• QUDT
Survey results – Metadata standards
Why not?
• Not applicable
• Not known (majority)
• No need
• Just started
• Lots of talk about their development, but little/no implementation
• No capacity
• No existing metadata standard for phenotyping data
• Too complex
• Not a priority
Other metadata standards and tools are:
• AgMIP/ICASA
• FCDC
• ICASA
• Phenotypic metadata standards developed in Genesys
• PODD science ontology
How do we move on (1)
§ Workshop (1-2 October in Versailles):
  § Objectives:
    § provide guidance to the Wheat Data Interoperability Working Group on
      § which priorities should be given to data types with non-standardized data formats (according to the survey)
      § which existing use cases can be used to showcase the interoperability gains that linked data can bring
    § discuss and adjust the draft of the cookbook
    § prepare a "White Paper" to publicize and communicate the recommended guidelines
  § Expected outcomes:
    § List of recommended standards for each data type
    § List of standards to develop
    § List of interoperability use cases
    § Cookbook revised and adjusted
    § Draft "White Paper" for publicizing and communicating the recommended guidelines
How do we move on (2)
§ Prototype and library of linked open vocabularies:
  § We need volunteers with expertise in
    § Software development
    § Linked data
    § Metadata and vocabularies
§ Web site:
  § Wheatdata.org?