Greedy Algorithms in the Libraries of Biology
17-Apr-2008, 3:30-3:45 PM, Avogadro-Scale Computing, MIT Bartos E15
Thanks to: PGP
Is biology optimal?
Human: Past -> Present
• Locomotion: 50 km/h -> 26720 km/h
• Ocean depth: 75 m -> 4500 m
• Visible λ: 0.4-0.7 µm -> pm-Mm
• Cold: 0 °C -> 3 K
• Memory: 20 yr -> 2000 yr
3 Exponential technologies
1 to 18 month doubling times
• Computation & Communication
• Synthetic chemistry
• Analytic
[Figure: exponential trend plot; point labels: Gb chips, human, urea, tRNA, B12, telegraph, tRNA]
Shendure J, Mitra R, Varma C, Church GM, 2004. Carlson 2003; Kurzweil 2002; Moore 1965.
Avogadro scale, >>Yottaflops (from CMOS to sea moss)
• Ultra-parallel: 10^38 units (lab libraries: 10^8 to 10^15 25-mers)
• Adaptable: Evolution (years), Immune (days), Neural (seconds)
• Thermodynamic limit: 2 x 10^19 op/J (irreversible); 3 x 10^20 for polymerase (10^10 for current computers)
• Memory density: Neural: (10^12 op/s & 10^6 bits)/mm^3; DNA: (10^3 op/s & 1 bit)/nm^3
• Error rate: DNA: 10^-9; RNA/protein: 10^-4
• Biofuel: 4 x 10^7 J/kg (~=$)
Adleman 1994
DNA error rates
[Figure: DNA replication fork]
1. Incorporation, 5' to 3'
2. Proofreading exonuclease, 3' to 5'
3. Mismatch repair
Ellis et al. PNAS 2001; Costantino & Court PNAS 2003
Bionano: Inorganic-microfab interfaces
• Metal-oxide-semiconductors (sponge silicateins for Ti & Ga oxides)
• Magnetic components (magnetosomes in magnetotactic bacteria)
• Optical fibers & lenses (e.g. Venus basket sponge)
• Bacterial reduction of salts to metals (e.g. Se, Au, Ag)
• Reading and writing DNA
Reading DNA: Open-source hardware, software, wetware
Polonator G.007: ~10 to $400/Gbp; 1E-6 @ >3X redundancy
Synthetic Biology: augmentation & combinatorics (not minimization)
1. Synthetic DNA: 1 Mbp per month (Codon Devices)
2. New polymers in vitro: affinity selection (Vanderbilt)
3. Hydrocarbon & other chemical syntheses in E. coli (LS9)
4. Bacterial & stem cell therapies (SynBERC & MGH)
5. New codes: viral-resistant cells & new amino acids (MIT)
6. Synthetic ecosystems: evolve secretion & signaling
7. Interfaces of Genomics & Society
Hierarchical, modular, evolvable
DNA origami: highly predictable 3D nanostructures
• Rothemund, Nature '06
• Douglas et al., PNAS '07: DNA-nanotube-induced alignment of membrane proteins for NMR structure determination
10 Mbp of DNA / $300 chip
Spatially patterned chemistry:
• 8K: Atactic/Xeotron/Invitrogen, Photo-Generated Acid
• 12K: Combimatrix, Electrolytic
• 44K: Agilent, Ink-jet, standard reagents
• 380K: Nimblegen/GA, Photolabile 5' protection
Amplify pools of 50-mers using flanking universal PCR primers & 3 paths to 10X error correction
Tian et al. Nature 432:1050; Carr & Jacobson 2004 NAR; Smith & Modrich 1997 PNAS
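A minimal sketch of how a pool of chip oligos could be laid out so that one universal primer pair amplifies every member. The primer sequences, the function names, and the choice to count the primer sites inside the 50-nt length are all assumptions made for illustration; none of them come from the slide or the cited papers.

```python
# Hypothetical 15-nt universal primers (illustrative only, not validated sequences).
FWD_PRIMER = "ACGTTGCAAGCTTGA"
REV_PRIMER = "TTCAGGATCCGTACG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_pool(target, oligo_len=50):
    """Split `target` into payload chunks and flank each with the primer sites.

    Assumption: oligo_len counts the two primer sites, so a 50-mer with two
    15-nt sites carries 20 nt of payload. The last chunk may be shorter.
    """
    payload_len = oligo_len - len(FWD_PRIMER) - len(REV_PRIMER)
    if payload_len <= 0:
        raise ValueError("oligo_len is too short for the two primer sites")
    # The reverse primer anneals to the top strand, so the oligo carries
    # the reverse complement of the reverse primer at its 3' end.
    tail = reverse_complement(REV_PRIMER)
    return [
        FWD_PRIMER + target[i:i + payload_len] + tail
        for i in range(0, len(target), payload_len)
    ]
```

Because every oligo shares the same two flanking sites, a single PCR with the one primer pair would amplify the whole pool at once; the payload in the middle is the only part that varies.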
Mirror world: resistant to enzymes, parasites, predators
• Mirror aptamers, ribozymes, etc. require mirror polymerases
• 352-amino-acid Dpo4 (Sulfolobus DNA polymerase IV): 347 peptide bonds done; 4 to go
• L-amino acids, D-nucleotides (current biosphere)
• D-amino acids, L-nucleotides (mirror biopolymers)
Why synthesize (minimal) in vitro self-replication?
• Molecular Biology Central Dogma: DNA > RNA > Protein (PCR, T7 RNA pol, in vitro translation)
• Production of devices larger than or toxic to cells
• Directed evolution of drugs & affinity agents
• Mirror-image proteins
Duhee Bang (HMS), Tony Forster (Vanderbilt)
Pure in vitro translating & replicating system: ideal for comprehensive atomic, ODE & stochastic models
113 kbp DNA, 151 genes
Forster & Church MSB '05, Genome Res. '06; Shimizu, Ueda et al. '01
Genome engineering CAD
[Figure: assembly pipeline; labels as they appear on the slide]
• Methods: chemical synthesis; polymerase in vitro; recombination in vivo (E. coli); recombination in human cells
• Size scale: 70 b, 15 Kb, 5 Mb, 250 Mb
• Vectors: Bacterial (Artificial) Chromosomes (BACs); Human (Artificial) Chromosomes (HACs)
• Error rates: chemical synthesis 1E-2; error correction (MutS) 1E-4; sequencing 1E-7
Isaacs, Carr, Emig, Gong, Tian, Reppas, Jacobson, Church
Native DNA computing: Lab Evolution
About 3 serial additive changes per 30 days vs 2^30 exhaustive search
• Reppas/Lin: Trp/Tyr exchange
• Tolonen: Ethanol resistance
• Lenski: Citrate utilization
• Palsson: Glycerol utilization
• Edwards: Radiation resistance
• Ingram: Lactate production
• Marliere: Thermotolerance
• J&J: Diarylquinoline resistance (TB)
• DuPont: 1,3-propanediol production
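The contrast between serial additive changes and a 2^30 exhaustive search can be sketched with a toy model. The version below assumes a purely additive fitness over binary sites, which is a simplification chosen for illustration and not a claim about any of the experiments listed; the weights and function names are hypothetical.

```python
from itertools import product


def fitness(genome, weights):
    # Toy additive fitness: each mutated site (bit = 1) adds its own weight.
    return sum(w for bit, w in zip(genome, weights) if bit)


def greedy_evolve(weights):
    """Fix the best single-site change each round until no change helps.

    Returns the final genome and the number of variants that were scored.
    """
    genome = [0] * len(weights)
    evaluations = 0
    while True:
        current = fitness(genome, weights)
        best_gain, best_site = 0, None
        for site in range(len(genome)):
            if genome[site]:
                continue  # site already mutated
            trial = genome[:]
            trial[site] = 1
            evaluations += 1
            gain = fitness(trial, weights) - current
            if gain > best_gain:
                best_gain, best_site = gain, site
        if best_site is None:
            return genome, evaluations
        genome[best_site] = 1


def exhaustive_search(weights):
    """Score all 2**n genotypes and keep the best one."""
    best = max(product((0, 1), repeat=len(weights)),
               key=lambda g: fitness(g, weights))
    return list(best), 2 ** len(weights)


# Hypothetical per-site fitness effects for a 10-site toy genome.
WEIGHTS = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3]
```

With 30 sites the exhaustive search would score 2**30 = 1,073,741,824 genotypes, while this greedy loop scores at most 30 + 29 + ... + 1 = 465 single-site variants. The greedy route is only guaranteed to reach the optimum when effects are additive, as assumed here; with interacting sites it can stall at a local optimum.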
rE. coli Strategy #3: ss-Oligonucleotide Repair
[Figure: DNA replication fork]
• Obtain 25% recombination efficiency in E. coli strains lacking mismatch repair genes (mutH, mutL, mutS, uvrD, dam)
• Improved recombination frequency: from 10^-4 to 0.25 (> 3 log increase!)
Ellis et al. PNAS 2001; Costantino & Court PNAS 2003
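The "> 3 log" figure follows directly from the two frequencies on the slide. A short check, with a hypothetical helper name:

```python
import math


def log_increase(before, after):
    """Return the improvement as a fold change and in log10 units."""
    fold = after / before
    return fold, math.log10(fold)


fold, logs = log_increase(1e-4, 0.25)
print(f"{fold:.0f}-fold, {logs:.2f} logs")  # prints: 2500-fold, 3.40 logs
```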
Multiplex Automated Genome Engineering (MAGE)
[Figure: cycle on a membrane filter sealed with an O-ring; step labels:]
• Wash with water & DNA pool (50)
• Concentrate
• Concentrate, electroporate
• Resuspend, bubble, select
Wang, Isaacs, Terry
GEMASS Prototype H. Wang, Church Lab, Harvard, 2008
Recombination-Cycling for Combinatorial Accelerated Evolution
[Figure panels: Mutation Distribution: 11 oligos, 15 cycles; Mutation Distribution: 54 oligos, 45 cycles]
Oligo pool | # cycles | Best clone (98th %tile) | Fraction of mutated sites | Time*
11 | 15 | 7 | 7/11 | 3 days
54 | 45 | 23 | 23/54 | 9 days
* Continuous cycling
• Scaling & Automation
• Increase Efficiency of Recombination
Wang, Isaacs, Carr, Jacobson, Church
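One way to reason about a mutation distribution like those in the table is a simple binomial model. The sketch below assumes every targeted site converts independently with the same per-cycle probability, which is an idealization and not the authors' stated model; the function names and any per-cycle probability passed in are hypothetical and are not fitted to the data above.

```python
from math import comb


def site_mutated_prob(p_cycle, cycles):
    """Probability that one site has been converted after `cycles` rounds."""
    return 1.0 - (1.0 - p_cycle) ** cycles


def mutation_distribution(sites, p_cycle, cycles):
    """Binomial distribution of the number of mutated sites per clone.

    Entry m of the returned list is P(exactly m of `sites` are mutated).
    """
    q = site_mutated_prob(p_cycle, cycles)
    return [comb(sites, m) * q ** m * (1.0 - q) ** (sites - m)
            for m in range(sites + 1)]


def percentile_clone(dist, pct):
    """Smallest mutation count whose cumulative probability reaches `pct`."""
    cumulative = 0.0
    for m, prob in enumerate(dist):
        cumulative += prob
        if cumulative >= pct - 1e-12:
            return m
    return len(dist) - 1
```

For example, `percentile_clone(mutation_distribution(11, 0.05, 15), 0.98)` gives the 98th-percentile clone for an 11-oligo pool after 15 cycles under an assumed 5% per-site, per-cycle conversion rate. Raising the per-cycle rate or the cycle count shifts the whole distribution toward more mutated sites, which is the effect the two bullets on scaling and recombination efficiency point at.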
Multiplex Automated Genome Engineering (MAGE)
[Figure: instrument schematic; component labels:]
• syringe pump
• computer communication / data acquisition system
• electrically actuated valves
• OD sensor
• electroporation cuvette w/ membrane filter
Wang, Isaacs, Terry
Fab vs. Bio-fab
Fab:
+ Plays well with digital computers
- Doesn't get DNA
- Needs us to replicate
- Needs expensive Fab (e.g. ICs)
- Intelligent Design
Bio-fab:
- No habla C++
+ DNA is its native digital media
+ We need them
+ Simple or complex inputs
+ Evolution
Cross-feeding symbiotic systems: aphids & Buchnera
• obligate mutualism
• nutritional interactions: amino acids & vitamins
• established 200-250 million years ago
• close relative of E. coli with tiny genome (618~641 kb)
[Figure labels: MILKFTWV, HR, Aphids]
http://buchnera.gsc.riken.go.jp
Pink = enzymes apparently missing in Buchnera
Shigenobu et al. Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS. Nature 407, 81-86 (2000).
Synthetic genome pair evolution
[Figure panels: First Passage, Second Passage]
trp/tyrA pair of genomes shows best cogrowth
Reppas, Lin et al. Accurate Multiplex Polony Sequencing of an Evolved Bacterial Genome. Science 2005
Co-evolution of mutual biosensors/biosynthesis, sequenced across time & within each time-point
Independent lines of TrpΔ & TyrΔ co-culture
• 5 OmpF (pore: large, hydrophilic > small): 42 R->G, L, C; 113 D->V; 117 E->A
• 2 Promoter (cis-regulator): -12 A->C, -35 C->A
• 5 Lrp (trans-regulator): 1 bΔ, 9 bΔ, 8 bΔ, IS2 insert, R->L in DBD
Heterogeneity within each time-point. At late times Tyr- becomes prototroph!
Reppas, Shendure, Porreca
[Figure: axis ticks from -12 to -6]
Reducing costs of open-source hardware & wetware
Factor
• 30 Equipment speed: from 1 up to 30 Mpixels/sec camera
• 4 Equipment cost: from $500K down to $150K (Danaher Inc)
• 36 Parallelism: 36 flow-cells per camera, 2 billion beads
---------
• 75 Flow cell volume: 1.5 mm down to 0.0085 mm thin
• 40 Kit costs: $2000 down to $50 at standard enzyme costs
• 10 Enzymes: $4000/mg down to <$400 (Enzymatics Inc)
• 50 Genomic subset (Exome: 1% genome)