Integration and analysis of multi-type high-throughput data for biomolecular knowledge discovery. Dr. Erik Bongcam-Rudloff, SGBC-SLU, Uppsala, Sweden
Biologists' modus operandi: Observing a phenomenon that is in some way interesting or puzzling. Making a guess as to the explanation of the phenomenon. Devising a test to show how likely this explanation is to be true or false. Carrying out the test and, on the basis of the results, deciding whether the explanation is a good one or not. In the latter case, a new explanation will (with luck) 'spring to mind' as a result of the first test. http://www.biology.ed.ac.uk/archive/jdeacon/statistics/tress2.html
The Observed phenomenon
Selection of test times
But what is the real event?
Sometimes you could be lucky. “Positive” results are used, “negative” ones rejected. Why? Only positive results are publishable.
Next Generation techniques
New challenges: 1 TB of data
Gbases produced at Sanger
World NGS Map: http://omicsmaps.com/
But this is wonderful! Or is it? Sequence without knowledge connected to it is worth: 0. The deluge of data produced by these hordes of machines worldwide demands: automatic workflows; completely new systems to shuffle data around; storage on a scale never used before; machines with gigantic amounts of RAM.
COSTS
PROBLEMS: nomenclature; publishing culture; moving-target development; old ways of working and resistance to changes in culture.
Publishing culture as an example: We get taxpayers' money, we pay publishers to publish, and the publishers sell the articles and obtain the copyrights. To connect knowledge to sequences we need automatic methods, workflows, and text mining. Most of this is limited by closed database systems. The only open resource is PubMed, but PubMed has only short abstracts, with no information about conditions, materials and methods (M&M), etc. We need to change this culture.
The BLAST analogy... By far the tool most used by biologists. It would not be possible if the databases were not Open Access and freely searchable. Imagine if nucleotide and protein databases followed the life science publishing model.
BLAST
Human-centric: What about all other areas of the life sciences? Most genes are named by sequence similarity, but are the functions the same?
Microbiome: A microbiome is the totality of microbes, their genetic elements (genomes), and environmental interactions in a particular environment. http://www.secondgenome.com
Fat and lean: “Metabolic effects of transplanting gut microbiota from lean donors to subjects with metabolic syndrome”, A. Vrieze et al., EASD abstracts, 24 September 2012. The result: lean donor faecal infusion improves hepatic and peripheral insulin resistance as well as fasting lipid levels in obese individuals with the metabolic syndrome.
Genome sizes
How many species? Estimates span several orders of magnitude: 3-50 million species of arthropods; 1-100 million species of nematodes. Only a portion of bacteria have been identified; 99% of bacteria cannot be cultured. “Once the diversity of the microbial world is catalogued, it will make astronomy look like a pitiful science.” Julian Davies, Professor Emeritus, UBC
New research strategies: microbial, livestock, plants
Typical sources of metagenomics: soil samples, sea water samples, air samples, medical samples, farm animal samples, ancient bones, the human microbiome
Ion Proton: "Personal Genome Machine". Real tests of transcriptome sequencing on the Proton: using 500 ng of input poly-A RNA, it was possible to generate 50 million reads from a melanoma cancer sample (Joe Boland of the National Cancer Institute, according to GenomeWeb). Life Technologies Corporation
Oxford Nanopore: http://www.nanoporetech.com/
High technology everywhere!
New applications: Only imagination limits what it is possible to do using next-generation technologies!
The big challenge: Open Access, open source, collaborative networks; data sharing; a common language; tool systems to glue it all together!
SeqAhead, COST Action BM1006: Next Generation Sequencing Data Analysis Network. 2011-2014. COST Action, 25 countries. http://www.seqahead.eu/
ALLBIO: 10 partners, 8 countries. FP7 project: Broadening the Bioinformatics Infrastructure to unicellular, animal, and plant science. www.allbioinformatics.eu
THANKS!! Como 2012