BioData.pt | ELIXIR PT: A Biological Data e-Infrastructure for Research and Innovation
Ricardo Leite, Ana Portugal Melo, Cirenia Baldrich, Daniel Faria, Daniel Neves, João Cardoso
Jornadas FCCN – 7 May 2019
Who is BioData@IGC?
Filipa Almeida - Project Manager
João Garcia - Systems Administrator
Daniel Neves - User Support Officer/Tools Developer
Ricardo Leite - Genomics and Bioinformatics Expert
João Sousa - Compute Platform Coordinator
Ana Portugal Melo - Executive Director
Cirenia Baldrich - Software Developer
Daniel Faria - Interoperability Expert
Pedro Fernandes - Training Coordinator
Miguel Cardoso - Training Assistant
Beatriz Lima - Galaxy Trainee
Henrique Costa - Shiny-R Trainee
What is ELIXIR?
Intergovernmental organisation founded in 2014, with 23 members and over 180 research organisations.
Brings together life science resources: databases, software tools, training materials, cloud storage and supercomputers.
Aims to coordinate these resources so that they form a single infrastructure, making it easier for scientists to find and share data, exchange expertise, and agree on best practices.
https://www.elixir-europe.org
BioData.pt: The Portuguese Node of ELIXIR
GTPB Platforms | Communities | Services https://www.elixir-europe.org/services
• A sustainable infrastructure for storing, coordinating and distributing human data
• Standardised tools to discover and access human data
• Local-EGAs for metadata sharing (European Genome-phenome Archive)
• Regulating access to sensitive data
• Long-term management policies for human data
Ensures that human data in ELIXIR services is handled within the appropriate legal and ethical framework
https://www.elixir-europe.org/communities/human-data
Data Management in the Life Sciences
João Cardoso
Problem:
● Life science research produces huge quantities of data.
● It is crucial to make this data Findable, Accessible, Interoperable and Reusable (FAIR).
● This data can be sensitive or classified.
● Managing this data is a complex task that requires expert knowledge.
Moving towards:
● The Data Management Plan (DMP) is a document describing:
  ○ Techniques
  ○ Methods
  ○ Policies
  with the goal of enabling good data management practices.
● Funding bodies such as the EC, NSF and FCT already require that grant applications be accompanied by a DMP.
● BioData.pt assists its communities with data management by:
  ○ Providing information and training on data management practices.
  ○ Creating a functional digital repository.
  ○ Providing assistance in the creation and usage of DMPs.
  ○ Creating a collection of DMP templates.
Standards for Managing Plant Phenotype Data
Daniel Faria
Plant Sciences:
● Core ELIXIR community
● Co-led by BioData.pt and ELIXIR-FR
● IGC, iBET and ITQB
● Focus on tree species
Interoperability: Data Submission Data Storage & Indexing Structure? Format? Access? FAIR Data Retrieval Interface?
Solution: Data Submission
● Minimum Information About a Plant Phenotyping Experiment (MIAPPE)
  ○ 11 sections; 83 fields
  ○ Submission: spreadsheet; ISA-Tab; interface (WIP)
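As a rough illustration of what a spreadsheet submission check could look like, the sketch below reads a CSV export and reports rows with empty mandatory fields. The three column names are a hypothetical subset chosen for illustration only; they are not the 83 fields mentioned on the slide, and this is not the BioData.pt submission tool.

```python
import csv
import io

# Hypothetical subset of mandatory columns (illustrative names only,
# not the actual checklist used by BioData.pt).
REQUIRED_FIELDS = ("Investigation title", "Study title", "Biological material ID")


def missing_fields(row, required=REQUIRED_FIELDS):
    """Return the required column names that are absent or blank in one row."""
    return [name for name in required if not (row.get(name) or "").strip()]


def validate_csv(text):
    """Map spreadsheet line numbers to the required fields missing on that line."""
    reader = csv.DictReader(io.StringIO(text))
    problems = {}
    # Line 1 is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        missing = missing_fields(row)
        if missing:
            problems[line_no] = missing
    return problems


SAMPLE_SHEET = (
    "Investigation title,Study title,Biological material ID\n"
    "Cork oak drought,Trial 1,A000001\n"
    "Cork oak drought,,A000002\n"
)

if __name__ == "__main__":
    print(validate_csv(SAMPLE_SHEET))
```

Calling `validate_csv` on the sample sheet flags only the second data row, whose study title is blank.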
Solution: Data Storage & Indexing
● (Plant) Breeding API
● PT end-point (https://brapi.biodata.pt/)
  ○ 51 tables; 255 fields
  ○ 4 datasets; 3 species
"result": {
  "accessionNumber": "A000001",
  "acquisitionDate": "2019-01-01",
  "breedingMethodDbId": "crossing",
  "commonCropName": "cork oak",
  "countryOfOriginCode": "Portugal",
  "genus": "Quercus",
  "species": "suber",
  "germplasmName": "Quercus suber PTX011",
  "instituteName": "ITQB",
  "pedigree": "A000001/A000002",
  "seedSource": "A000001/A000002"
  [...]
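A minimal sketch of how a client might consume a germplasm record like the one above. The base URL is the end-point named on the slide; the `brapi/v1/germplasm/{id}` path follows the general Breeding API convention and is an assumption about this particular server, as are the function names. The sample payload is modelled on the slide's record, and no network call is made.

```python
import json
from urllib.parse import quote, urljoin

BASE_URL = "https://brapi.biodata.pt/"  # end-point named on the slide


def germplasm_url(germplasm_db_id, base_url=BASE_URL, api_path="brapi/v1/"):
    """Build a germplasm detail URL (api_path is an assumed route prefix)."""
    return urljoin(base_url, f"{api_path}germplasm/{quote(germplasm_db_id, safe='')}")


def summarise_germplasm(payload):
    """Pick a few descriptive fields out of a single-record response."""
    result = payload["result"]
    binomial = f'{result.get("genus", "")} {result.get("species", "")}'.strip()
    return {
        "name": result.get("germplasmName"),
        "binomial": binomial,
        "crop": result.get("commonCropName"),
        "institute": result.get("instituteName"),
    }


# Sample response modelled on the record shown on the slide.
SAMPLE = json.loads("""
{
  "result": {
    "accessionNumber": "A000001",
    "commonCropName": "cork oak",
    "genus": "Quercus",
    "species": "suber",
    "germplasmName": "Quercus suber PTX011",
    "instituteName": "ITQB"
  }
}
""")

if __name__ == "__main__":
    print(germplasm_url("A000001"))
    print(summarise_germplasm(SAMPLE))
```

Keeping URL construction separate from response parsing lets the parsing be exercised against saved responses without contacting the server.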
Community Showcase: The Cork Oak Genome Portal
Problem: HL8 cork oak tree selected for genome sequencing. Lia Rodrigues ?
Solution:
Empowering Researchers with User-Friendly Applications
Problem: How to scale up support for a growing amount of generated data and a broader user community?
Illumina NextSeq 500; 10x Genomics Chromium Controller
Solution: Give researchers the tools to be more independent in their analyses and to better control their own data.
Use cases:
- Differential expression analysis of RNA-seq data
- Analysis of single-cell RNA-seq data
Open source web applications with a focus on:
- Accessibility
- Documentation
- Reproducibility
Thank you!
E-mail: info@biodata.pt
www.biodata.pt