Cloud Genomics: Deployment Examples, David A. Shaywitz, MD

Cloud Genomics: Deployment Examples
David A. Shaywitz, MD, PhD
Chief Medical Officer, DNAnexus
Proprietary and Confidential

Cloud Genomics Has Arrived

Enabling Technology 1: Next Generation Sequencing
Source: The Economist

Enabling Technology 2: Cloud Computing
Venter Congressional Testimony, 7/17/2014:
“There were two thresholds we just passed…. One was the sequencing technology that just barely passed the threshold of cost and accuracy, but the most important changes are in the computer world. And we are going to rely very heavily on cloud computing, not only to house this massive database, but to be able to use it internationally…. The cloud sort of makes that seamless….”
“Trying to move things from my institute in Rockville, MD to La Jolla, we had dedicated fiber but it is now so slow with these massive data sets, we use Sneakernet or FedEx to send disks because we can’t send it by what you would think would be normal transmissions. So the use of the cloud is the entire future of this field.”
FDA Now Embracing Cloud
“Taha Kass-Hout, FDA’s chief health informatics officer, further provided some insights at the briefing about how the agency is thinking about creating an infrastructure for sharing the data in these repositories. Generally, in building out its bioinformatics capabilities, the FDA is thinking about storing curated data in the cloud, or on a hybrid of cloud and on-premise systems.” - GenomeWeb, 2/18/2015
NIH Now Embracing Cloud
  – Updated policy (3/2015) permits use of appropriate cloud platform for controlled-access data
  – DNAnexus specifically called out by NIH for best practices around security and compliance in cloud

Why Cloud?
• Computation power
• Security
• Integration & Collaboration

How DNAnexus Leverages The Cloud: Partnering With Visionary Organizations…

… To Create A Global Genetic Data Network
Diagram labels: Medical Centers; Diagnostic Test Providers; Sequence Service Providers/CROs; Pharmas; NGS Sequencing Instrument; Ingest Raw Sequence; Analyze, Store, Share – securely & safely

What the DNAnexus secure cloud platform enables
• Effortless global collaboration (w/o FedEx-ing disks…)
  – Intuitive and effortless collaboration among authorized partners
  – Maintains and documents provenance of data at every step
• Data sharing within exceptional safety framework
  – Security that almost certainly exceeds that of any given local cluster
  – Rigorously defined control of data access, dense audit trail
  – Highlighted by NIH for industry best practices around security and compliance
• Access to massive computational power
  – Potential to relieve analytic stuck-point created by NGS
  – Essentially unlimited compute you can call up when you need it and can shut down when you don’t

Use Case 1: Distributed Sequencing, Central Processing
• Sequencing from multiple sources (sequencing centers, CROs, others) uploaded and centrally analyzed
• Separation of data collection/sequencing from data analysis/processing means you can optimize and control the quality and cost of the two separately

Distributed sequencing, centralized processing: Commercial NIPT test example
Actual customer clinical test workflow
Workflow diagram labels: REPORT; PATIENT; CLINICAL; LAB; (Millions); (Thousands); (Hundreds); COMPANY X PIPELINE; COMPANY X REPORT GENERATOR; BILLING SYSTEM
Genome sequencing offers a huge opportunity for clinical genetics services to massively expand their ability to diagnose and characterize rare and serious heritable genetic disorders.

Effortless collaboration among distributed partners: Academic consortium example
• Flagship project of the US National Human Genome Research Institute. Over $300 million spent since 2004.
• Goal of the Encyclopedia of DNA Elements (ENCODE): to map biochemical and regulatory activities of the human genome using next generation sequencing.
• Foundational multi-petabyte data set, involving investigators at multiple top-tier institutions.
Diagram labels: Data Coordination; Sequencing; Scientific Community; ENCODE
DNAnexus is the platform uniting genomic research collaboration between multiple institutions working on the ENCODE project.
DNAnexus Company Overview, Sep 2014 | Confidential

Use Case 2: Large Scale Data Integration To Guide Drug Discovery
Secure Cloud Data Platform:
Source: John Penn, Regeneron presentation, 4/21/2015, Bio-IT World, Boston

Source: John Penn, Regeneron presentation, 4/21/2015 Bio-IT World, Boston

RGC Sequencing Technology
Diagram labels: Samples (extracted DNA); Library Prep Automation; Illumina HiSeq Fleet; Data; 1.4 M Sample BioBank
Adapted from J Reid (Regeneron) presentation at re:Invent 2014

The Regeneron Genetics Center (RGC) data center
Adapted from J Reid (Regeneron) presentation at re:Invent 2014

Source: John Penn, Regeneron presentation, 4/21/2015 Bio-IT World, Boston

DNAnexus at RGC
Dr. Jeff Reid, Head of Genome Informatics
Source: GenomeWeb, May 17, 2015

Source: John Penn, Regeneron presentation, 4/21/2015 Bio-IT World, Boston

Cloud-based discovery
(Figure; surviving values: 140, 40)
Adapted from J Reid (Regeneron) presentation at re:Invent 2014

RGC
• Now at 1600 exomes/week
• Sequencing:
  – Whole exome sequencing
    • Population-based (Geisinger)
    • Family-based (CUMC)
    • Special founder populations (Amish)
  – Targeted sequencing
    • 100s of genes, 10,000s of people
    • Allows rapid hypothesis evaluation
Source: GenomeWeb, 3/17/2015

Two Strategies For Genomics and Drug Discovery (RGC doing both)
• Phenotype-first (canonical example: GWAS)
  – Define relevant phenotype of clinical interest
  – Extract binary and/or quantitative phenotype data from EHR
  – Look for genes associated with phenotype
  – Especially attractive to find loss-of-function (LOF) variants that seem protective
• Genotype-first (canonical example: PheWAS)
  – Start w/candidate gene(s) or variant(s) of particular interest
  – Extract binary and/or quantitative phenotype data from EHR
  – Look for phenotypes associated with dysfunctional gene
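The two strategies can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy cohort, the gene and phenotype names, and the use of a simple odds ratio as the association measure. It is not RGC's or DNAnexus's pipeline; a real analysis would typically use regression models with covariates and correct for multiple testing.

```python
from typing import List, Set, Tuple

# Hypothetical toy cohort: each person is (genes carrying an LOF variant,
# binary phenotypes extracted from the EHR). All names are made up.
Person = Tuple[Set[str], Set[str]]

cohort: List[Person] = [
    ({"GENE_A"}, {"obesity"}),
    ({"GENE_A"}, {"obesity"}),
    ({"GENE_A"}, {"obesity", "hypertension"}),
    ({"GENE_B"}, {"hypertension"}),
    ({"GENE_B"}, set()),
    (set(), {"hypertension"}),
    (set(), set()),
    (set(), set()),
]


def odds_ratio(people: List[Person], gene: str, phenotype: str) -> float:
    """Odds ratio of `phenotype` in carriers vs non-carriers of `gene`.

    Adds 0.5 to every cell (Haldane-Anscombe correction) so that an
    empty cell does not cause a division by zero.
    """
    a = b = c = d = 0
    for genes, phenotypes in people:
        carrier = gene in genes
        affected = phenotype in phenotypes
        if carrier and affected:
            a += 1  # carrier with the phenotype
        elif carrier:
            b += 1  # carrier without the phenotype
        elif affected:
            c += 1  # non-carrier with the phenotype
        else:
            d += 1  # non-carrier without the phenotype
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


def phenotype_first(people: List[Person], phenotype: str,
                    genes: List[str]) -> List[Tuple[str, float]]:
    """GWAS-style scan: fix one phenotype, rank genes by odds ratio."""
    scored = [(g, odds_ratio(people, g, phenotype)) for g in genes]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def genotype_first(people: List[Person], gene: str,
                   phenotypes: List[str]) -> List[Tuple[str, float]]:
    """PheWAS-style scan: fix one gene, rank phenotypes by odds ratio."""
    scored = [(p, odds_ratio(people, gene, p)) for p in phenotypes]
    return sorted(scored, key=lambda item: item[1], reverse=True)


gene_ranking = phenotype_first(cohort, "obesity", ["GENE_A", "GENE_B"])
phenotype_ranking = genotype_first(cohort, "GENE_A", ["obesity", "hypertension"])
```

In this sketch an odds ratio well above 1 suggests the LOF variant is enriched among affected people, while a value well below 1 is the pattern a protective LOF variant would show. The two functions share the same association measure and differ only in which axis is held fixed, which is the distinction the slide draws.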

What Integrated Data Approach Enables: Rapid Hypothesis Development and Evaluation
• Example 1: The false positive
  – Several promising variants identified in study of individual family
  – However, ability to query Geisinger population revealed variant relatively common and not associated with disease
  – Result: quickly discard hypothesis and move on
• Example 2: The true positive
  – Obesity-associated gene identified in study of individual family
  – Ability to query Geisinger population revealed range of mutations in this gene associated with obesity/high BMI
  – Result: prioritization as potential drug target
• Example 3: From EMR to potential target
  – Patient with clinically-interesting phenotype (e.g. metabolically “healthy” obese patient) identified in Geisinger population
  – Exome sequencing revealed potentially interesting (e.g. LOF) variant
  – Further evaluation in progress
Source: GenomeWeb, 3/17/2015; AWS re:Invent 2014
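The reasoning behind Examples 1 and 2 can be sketched as a simple triage rule applied to population-level counts. This is a hedged illustration only: the class, the function names, the counts, and both thresholds (`max_rare_freq`, `min_relative_risk`) are hypothetical and are not taken from the Regeneron or Geisinger work.

```python
from dataclasses import dataclass


@dataclass
class PopulationCounts:
    """Counts for one candidate variant in a population cohort (hypothetical)."""
    carriers_affected: int
    carriers_unaffected: int
    noncarriers_affected: int
    noncarriers_unaffected: int


def carrier_frequency(counts: PopulationCounts) -> float:
    """Fraction of the cohort that carries the variant."""
    carriers = counts.carriers_affected + counts.carriers_unaffected
    total = carriers + counts.noncarriers_affected + counts.noncarriers_unaffected
    if total == 0:
        raise ValueError("empty cohort")
    return carriers / total


def relative_risk(counts: PopulationCounts) -> float:
    """Disease risk in carriers divided by disease risk in non-carriers."""
    carriers = counts.carriers_affected + counts.carriers_unaffected
    noncarriers = counts.noncarriers_affected + counts.noncarriers_unaffected
    if carriers == 0 or noncarriers == 0 or counts.noncarriers_affected == 0:
        raise ValueError("relative risk is undefined for these counts")
    risk_carriers = counts.carriers_affected / carriers
    risk_noncarriers = counts.noncarriers_affected / noncarriers
    return risk_carriers / risk_noncarriers


def triage(counts: PopulationCounts,
           max_rare_freq: float = 0.01,
           min_relative_risk: float = 2.0) -> str:
    """Classify a family-study candidate using population data.

    Both thresholds are illustrative assumptions, not published cutoffs.
    """
    freq = carrier_frequency(counts)
    risk = relative_risk(counts)
    if risk >= min_relative_risk:
        return "prioritize"   # Example 2: association holds up in the population
    if freq > max_rare_freq:
        return "discard"      # Example 1: common variant, no association
    return "inconclusive"     # rare and not enriched: needs more data


# Made-up counts: a common variant with no enrichment, and a rare enriched one.
common_benign = PopulationCounts(50, 950, 450, 8550)
rare_enriched = PopulationCounts(30, 10, 2970, 6990)
```

With these made-up counts, `triage(common_benign)` returns `"discard"` and `triage(rare_enriched)` returns `"prioritize"`. The point of the sketch is that a candidate from a single family can be checked against a large population in one query, so a weak hypothesis is dropped quickly.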

Source: John Penn, Regeneron presentation, 4/21/2015 Bio-IT World, Boston

Our Goal: Empower Data Champions
• Innovation driven by champions:
  – Flowers & Melmon, Nature Medicine 1997: New drugs associated with clinical champion; not MD vs PhD, but someone who championed program
  – Judah Folkman, Soma Weiss Talk, Harvard: Innovation driven by “inquisitive physicians” and “inquisitive researchers”
  – Margaret Mead: “Never doubt that a small group of thoughtful, committed citizens can change the world; indeed, it's the only thing that ever has.”
• Goal of DNAnexus platform: empower inquisitive physicians, researchers, organizations, and data science champions

Final word
Source: Reuters, June 5, 2015: http://www.reuters.com/article/2015/06/05/us-health-genomics-cloud-insight-idUSKBN0OL0BG20150605

® The Global Network For Genomic Medicine TM
