Ion Torrent Semiconductor Sequencing: Mike Lelivelt, Ph.D.

Ion Torrent Semiconductor Sequencing
Mike Lelivelt, Ph.D., Director of Bioinformatics
The content provided herein may relate to products that have not been officially released and is subject to change without notice.

Who am I? – Mike Lelivelt
• Ph.D. from Univ of N Carolina in Microbial Genetics
• Post-Doc at Univ of WI Madison in Yeast Genomics
• 9 years at Affymetrix – software developer outreach
• 2 years at Partek – data analysis for arrays & NGS
• 3 years at Ion Torrent/Life Tech in bioinformatics
• Familiar with the challenges of turning genomic-scale assays into discrete, actionable decisions via software.
• I’m here to educate about semiconductor sequencing.
• I’m here to listen to your needs.
Confidential and Proprietary - DO NOT DUPLICATE

Opening thoughts…
• "The wonderful thing about standards is that there are so many of them to choose from." –Andrew Tanenbaum
• Driven more by the technology than we’d like to admit
• Each technology platform serves multiple applications
• A data standard implies a file format, but it’s really more about understanding data process flows
• Broad scope of NGS will drive multi-marker haplotypes and introduce allele frequency measurements into the decision process.
• Software is a tough business model. We’ll need to work together on this.

Simple Natural Chemistry
Eliminate source error:
  – Modified bases
  – Fluorescent bases
  – Laser detection
Eliminate read length limitations:
  – Unnatural bases
  – Protect/de-protect
  – Slow cycle time
Sequence is determined by measuring the hydrogen ions (H+) released (1 per base added per DNA strand) during second-strand synthesis as complementary bases (A, C, G or T) are sequentially incorporated by DNA polymerase.
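The one-ion-per-base relationship can be illustrated with a toy conversion from flow signals to bases. This is only a minimal sketch, not Ion Torrent's base caller. It assumes the signals are already normalized so that 1.0 corresponds to one incorporated base, and that nucleotides are flowed in a repeating T, A, C, G order. The function name `flows_to_bases` and the sample values are hypothetical.

```python
def flows_to_bases(signals, flow_order="TACG"):
    """Turn normalized per-flow signals into a base sequence.

    Assumption: a signal near n means n bases of the flowed
    nucleotide were incorporated (a homopolymer of length n).
    """
    bases = []
    for i, signal in enumerate(signals):
        # Nucleotide flowed at this step (assumed cyclic flow order).
        nucleotide = flow_order[i % len(flow_order)]
        # Round to the nearest whole number of incorporations, never negative.
        count = max(0, int(signal + 0.5))
        bases.append(nucleotide * count)
    return "".join(bases)


# Hypothetical normalized signals for two cycles of T, A, C, G.
example_signals = [0.1, 1.2, 0.3, 2.1, 0.2, 2.1, 3.1, 0.0]
print(flows_to_bases(example_signals))  # AGGAACCC
```

A flow with a signal near zero contributes no base, which is why the number of flows is larger than the number of bases in the resulting read.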

Massively Parallel Post-Light Sequencing

Torrent Browser runs on Torrent Server
Local compute and storage with an integrated web interface
• Torrent Server – hardware appliance
• Torrent Browser – easy web access to Ion data
• Plugins for secondary analysis, e.g. variant calling
For Research Use Only. Not intended for any animal or human therapeutic or diagnostic use.

Data Flow Leverages Several Formats
• Incorporation for 1 flow (DAT)
• Incorporation over many flows (DAT)
• Raw signals per flow (WELLS)
    0.1 1.2 0.3 2.1 0.2 2.1 3.1 0.0 0.1 1.2
    0.3 2.1 0.0 3.2 1.4 0.1 1.3 1.0 0.2 0.1
• Processed incorporations (SFF), but moving to unmapped BAM
    0 1 0 2 0 0 3 0 0 1 0 4 0 1 0 3 0 2 2 0 0 1 0 0 0
    3 0 4 0 1 1 0 2 0 0 3 0 4 0 1 0 3 0 2 0 0 0 1 1
• Flow space converted to base space (FASTQ)
    @7D8NM:4:9
    GGGATCAGGCTGTCGAACGCGTGATTACATCTAGCTA
    +
    AA*ABBBB?BBBBBBBABBB@@@BB?BABABCDA!@$
• TMAP – aligned reads (BAM, binary)
• TVC – Variant Call Format (VCF)
    ##FORMAT=<ID=DP,Number=1
    ##FORMAT=<ID=HQ,Number=2
    #CHROM  POS    ID         REF  ALT  QUAL  FILTER
    20      14370  rs6054257  G    A    29    PASS
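To make the base-space step concrete, the FASTQ record shown on the slide can be read with a few lines of Python. This is a rough sketch rather than a full parser. It assumes a four-line record and Phred+33 quality encoding, and the read name `7D8NM:4:9` is reconstructed from the slide text with its spaces removed. The function name `parse_fastq_record` is hypothetical.

```python
def parse_fastq_record(lines, offset=33):
    """Parse one four-line FASTQ record into (name, sequence, quality scores).

    Assumption: qualities are Phred scores encoded as ASCII code minus `offset`.
    """
    header, sequence, separator, quality = [line.strip() for line in lines]
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a four-line FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # One integer quality score per base.
    scores = [ord(ch) - offset for ch in quality]
    return header[1:], sequence, scores


# Record from the slide, with extraction spaces removed (an assumption).
record = [
    "@7D8NM:4:9",
    "GGGATCAGGCTGTCGAACGCGTGATTACATCTAGCTA",
    "+",
    "AA*ABBBB?BBBBBBBABBB@@@BB?BABABCDA!@$",
]
name, seq, quals = parse_fastq_record(record)
print(name, len(seq), quals[:3])  # 7D8NM:4:9 37 [32, 32, 9]
```

Under the Phred+33 assumption a score of 32 corresponds to an estimated error probability of about 10^-3.2 for that base, and the trailing `!` character decodes to a score of 0.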

What is raw data? Do you really want it?
Process Description    File Type   314 chip   316 chip   318 chip
Raw Voltage Data       DAT         40 GB      180 GB     320 GB
Signal Processing      WELLS       1 GB       8 GB       12 GB
Base Calls - Flow      SFF/BAM     1 GB       5 GB       8 GB
Base Calls - Base      FASTQ       0.3 GB     1.5 GB     2 GB
Base Calls - Aligned   BAM         0.1 GB     0.6 GB     3.5 GB
*1.5 v run 200 bp runs (440 flows, 110 cycles), Nov 2011
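The sizes above give a feel for how much each processing stage shrinks the data. The sketch below is plain arithmetic on the slide's figures, copied as printed. The dictionary layout and the function name `reduction_factor` are illustrative choices, not part of any Ion Torrent tool.

```python
# File sizes in GB per chip, taken from the slide's table.
SIZES_GB = {
    "DAT":   {"314": 40.0, "316": 180.0, "318": 320.0},
    "WELLS": {"314": 1.0, "316": 8.0, "318": 12.0},
    "FASTQ": {"314": 0.3, "316": 1.5, "318": 2.0},
}


def reduction_factor(chip, source="DAT", target="FASTQ"):
    """Return how many times smaller `target` is than `source` for one chip."""
    return SIZES_GB[source][chip] / SIZES_GB[target][chip]


for chip in ("314", "316", "318"):
    print(chip, round(reduction_factor(chip), 1))
# 314 133.3
# 316 120.0
# 318 160.0
```

By these figures, keeping FASTQ instead of DAT files cuts storage by roughly two orders of magnitude per run, which is the practical point behind the slide's question.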

Questions to Address
• Are allele calls alone sufficient to call HLA types?
  – Likely not. More data is usually better.
• Should HLA software be required to call novel alleles?
  – Speak no evil. See no evil. Hear no evil.
  – But software will serve the market.
• Should novel alleles be submitted to IMGT/HLA?
  – Balance between social curation & data security. More than just allele info?
• How should data be formatted to handle NGS richness?
  – Format is a snapshot in time.

All products mentioned in this presentation are for Research Use Only, not intended for any animal or human therapeutic or diagnostic use.