Rearrangements and Duplications in Tumor Genomes
- Slides: 62
Tumor Genomes
• Mutation and selection
• Compromised genome stability
• Chromosomal aberrations
  – Structural: translocations, inversions, fissions, fusions.
  – Copy number changes: gain and loss of chromosome arms, segmental duplications/deletions.

Rearrangements in Tumors
Change gene structure, create novel fusion genes
• Gleevec (Novartis 2001) targets the BCR-ABL fusion

Rearrangements in Tumors
Alter gene regulation
• Burkitt lymphoma translocation (image credit: Gregory Schuler, NCBI, NIH, Bethesda, MD, USA)
• Regulatory fusion in prostate cancer (Tomlins et al., Science, Oct. 2005)

Complex Tumor Genomes
1) What are the detailed architectures of tumor genomes?
2) Which genes are affected?
3) What processes produce these architectures?
4) Can we create custom treatments for tumors based on their mutational spectrum (e.g. Gleevec)?

Common Alterations across Tumors
• Mutations activate/repress circuits.
• Multiple points of attack.
• “Master genes”: e.g. p53, Myc.
• Others probably tissue/tumor specific.
[Figure legend: activation, repression; duplicated genes, deleted genes]

Human Cancer Genome Project etc.
• What tumors to sequence?
• What to sequence from each tumor?
  1. Whole genome: all alterations
  2. Specific genes: point mutations
  3. Hybrid approach: structural rearrangements

End Sequence Profiling (ESP)
C. Collins and S. Volik (UCSF Cancer Center)
1) Pieces of tumor genome: clones (100–250 kb).
2) Sequence ends of clones (500 bp).
3) Map end sequences to human genome.
• Each clone corresponds to a pair of end sequences (ES pair) (x, y).
• Retain clones that correspond to a unique ES pair.
[Figure: a clone of tumor DNA whose end sequences map to positions x and y in human DNA]

End Sequence Profiling (ESP): Valid ES pairs
• l ≤ y – x ≤ L, where l and L are the minimum and maximum clone sizes.
• Convergent orientation.

End Sequence Profiling (ESP): Invalid ES pairs
• Putative rearrangement in tumor.
• ES directions point toward breakpoints.

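The valid/invalid distinction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ESPair` record, its field names, and the strand encoding are hypothetical, and the default bounds simply reuse the 100–250 kb clone sizes quoted on the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ESPair:
    """Hypothetical record for one mapped end-sequence pair."""
    chrom_x: str
    x: int          # mapped position of the first end sequence (bp)
    strand_x: str   # '+' or '-'
    chrom_y: str
    y: int          # mapped position of the second end sequence (bp)
    strand_y: str


def is_valid(pair, l=100_000, L=250_000):
    """Return True if the pair satisfies the length and orientation constraints.

    l and L are the assumed minimum and maximum clone sizes in bp.
    Convergent orientation is encoded here (an assumption) as
    '+' for the left end and '-' for the right end.
    """
    same_chromosome = pair.chrom_x == pair.chrom_y
    convergent = pair.strand_x == '+' and pair.strand_y == '-'
    return same_chromosome and convergent and l <= pair.y - pair.x <= L


ok = ESPair("chr1", 1_000_000, '+', "chr1", 1_150_000, '-')
too_long = ESPair("chr1", 1_000_000, '+', "chr1", 3_000_000, '-')
print(is_valid(ok), is_valid(too_long))
```

Pairs that fail the check are the candidates for rearrangement breakpoints (or experimental errors) discussed on the following slides.
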
Outline
What does ESP reveal about tumor genomes?
1. Identify locations of rearrangements.
2. Reconstruct genome architecture, sequence of rearrangements.
3. In combination with other genome data (CGH).

ESP Data (Jan. 2006)
Samples: breast cancer cell lines (BT474, MCF7, SKBR3); tumors (brain, breast 1, breast 2, ovary, prostate); normal.
• Coverage of human genome: ≈ 0.34 for MCF7, BT474
Clones: 5267 5031 9580 7623 19831 9612 4246 1756 9267
ES pairs: 3923 3448 7994 5588 6785 3222 12073 1300 7300

1. Rearrangement breakpoints
MCF7 breast cancer
• Known cancer genes (e.g. ZNF217, BCAS3/4, STAT3).
• Novel candidates near breakpoints.
• Small-scale scrambling of genome more extensive than expected.

Structural Polymorphisms
• Human genetic variation is more than nucleotide substitutions.
• Short indels/inversions are present (Iafrate et al. 2004, Sebat et al. 2004, Tuzun et al. 2005, McCarroll et al. 2006, Conrad et al. 2006, etc.).
• ≈ 3% (53/1570) of invalid ES pairs are explained by known structural variants.
[Figure: 1.6 Mb inversion between breakpoints s and t; reference human A B C, human variant A -B C]

2. Tumor Genome Architecture
1) What are the detailed architectures of tumor genomes?
2) What sequence of rearrangements produces these architectures?

ESP Genome Reconstruction Problem
• Human genome (known): A B C D E
• Tumor genome (unknown): produced by an unknown sequence of rearrangements
• Map ES pairs to the human genome; the locations of the ES pairs in the human genome are known.
• Reconstruct the tumor genome.
[Figure: ES pair coordinates on the human genome, in order x1 x2 x3 x4 y1 y2 x5 y4 y3; the tumor genome is drawn from the blocks A, -C, -D, B, E]

ESP Genome Reconstruction: Comparative Genomics
[Figure: dot plot of the tumor genome (vertical axis, blocks A, -C, -D, B, E) against the human genome (horizontal axis, blocks A, B, C, D, E). ES pairs (x1, y1), (x2, y2), (x3, y3), (x4, y4) are marked, with coordinates x1 x2 x3 x4 y1 y2 y4 y3 along the human axis.]

ESP Plot
2D representation of ESP data
• Each point is an ES pair.
• Can we reconstruct the tumor genome from the positions of the ES pairs?
[Figure: ES pairs (x1, y1), (x2, y2), (x3, y3), (x4, y4) plotted with human genome coordinates (blocks A to E) on both axes]

ESP Plot → Tumor Genome
[Figure: ESP plot with human genome blocks A to E on both axes, and the reconstructed tumor genome A -C -D B E]

Real data noisy and incomplete!
Valid ES pairs
• satisfy length/direction constraints: l ≤ y – x ≤ L
Invalid ES pairs
• indicate rearrangements
• experimental errors

Computational Approach
1. Use known genome rearrangement mechanisms (inversion, translocation).
2. Find the simplest explanation for ESP data, given these mechanisms.
3. Motivation: genome rearrangement studies in phylogeny.
[Figure: inversion between breakpoints s and t (human A B C → tumor A -B C), and translocation between two chromosomes at breakpoints s and t]

ESP Sorting Problem
• G = [0, M], unichromosomal genome.
• Reversal ρ_{s,t}(x) = x if x < s or x > t; t – (x – s) otherwise.
• G' = ρG
Given: ES pairs (x_1, y_1), …, (x_n, y_n)
Find: minimum number of reversals ρ_{s_1,t_1}, …, ρ_{s_k,t_k} such that, if ρ = ρ_{s_1,t_1} ⋯ ρ_{s_k,t_k}, then (ρx_1, ρy_1), …, (ρx_n, ρy_n) are valid ES pairs.
[Figure: reversal of block B between s and t maps G = A B C to G' = A -B C and moves the ES pair ends x_1, y_1, x_2, y_2]

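ESP Sorting Problem (example)
• Sequence of reversals, each between breakpoints s and t.
• After the full sequence, all ES pairs are valid.
[Figure: blocks A B C rearranged to A -B -C, with ES pairs (x1, y1), (x2, y2), (x3, y3)]
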
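A small sketch of the coordinate map used in the ESP Sorting Problem may help. It implements the reversal x ↦ t – (x – s) on [s, t] and checks only the length constraint; orientation is ignored, coordinates are in kb, and the bounds 100 and 250 are assumptions taken from the clone sizes quoted earlier. The function names are illustrative, and reversals are applied in list order, which is one possible reading of the composition on the slide.

```python
def reversal(s, t, x):
    """Image of coordinate x under the reversal of the interval [s, t]."""
    return x if x < s or x > t else t - (x - s)


def apply_reversals(reversals, pair):
    """Apply a list of (s, t) reversals, in list order, to both ends of an ES pair."""
    x, y = pair
    for s, t in reversals:
        x, y = reversal(s, t, x), reversal(s, t, y)
    return (min(x, y), max(x, y))


def length_valid(pair, l=100, L=250):
    """Length constraint only (kb); orientation is not modelled in this sketch."""
    x, y = pair
    return l <= y - x <= L


# Hypothetical invalid pair whose ends lie on either side of breakpoint s = 1000.
pair = (900, 1950)
fixed = apply_reversals([(1000, 2000)], pair)
print(length_valid(pair), fixed, length_valid(fixed))
```

Here a single reversal of [1000, 2000] brings the far end back to within clone length of the near end, which is the sense in which a reversal "explains" an invalid pair.
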
Filtering Experimental Noise
1) Pieces of tumor genome: clones (100–250 kb).
2) Sequence ends of clones (500 bp).
3) Map end sequences to human genome.
• Rearrangement: cluster of invalid pairs.
• Chimeric clone: isolated invalid pair.

Sparse Data Assumptions
1. Each cluster results from a single inversion.
2. Each clone contains at most one breakpoint.
[Figure: ES pairs (x1, y1), (x2, y2), (x3, y3) shown on the human and tumor genomes]

ESP Genome Reconstruction: Discrete Approximation
1) Remove isolated invalid pairs (x, y).
2) Define segments from clusters.
3) ES orientations define links between segment ends.
[Figure: ESP plot with the human genome on both axes, showing clustered pairs (x1, y1), (x2, y2), (x3, y3) and breakpoints s and t]

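Step 1 above (separating clustered from isolated invalid pairs) could be sketched as follows. The clustering rule is an assumption made for illustration, not necessarily the rule used by the authors: two invalid pairs join the same cluster when both their x and y coordinates lie within one maximum clone length L of each other, and clusters are closed under single linkage.

```python
def cluster_invalid_pairs(pairs, L=250_000):
    """Split invalid ES pairs into clusters (size >= 2) and isolated pairs.

    pairs: list of (x, y) coordinates in bp on the human genome.
    L: assumed maximum clone size in bp, used as the linkage distance.
    """
    parent = list(range(len(pairs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            (xi, yi), (xj, yj) = pairs[i], pairs[j]
            if abs(xi - xj) <= L and abs(yi - yj) <= L:
                parent[find(i)] = find(j)

    groups = {}
    for i, pair in enumerate(pairs):
        groups.setdefault(find(i), []).append(pair)

    clusters = [g for g in groups.values() if len(g) >= 2]
    isolated = [g[0] for g in groups.values() if len(g) == 1]
    return clusters, isolated


# Hypothetical coordinates: three pairs near one breakpoint, one stray pair.
invalid = [(1_000_000, 5_000_000), (1_050_000, 5_040_000),
           (1_120_000, 4_950_000), (9_000_000, 20_000_000)]
clusters, isolated = cluster_invalid_pairs(invalid)
print(len(clusters), isolated)
```

The quadratic pairwise loop is fine for a few hundred invalid pairs, which is the scale reported for MCF7; a larger data set would call for sorting or a spatial index.
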
ESP Graph
Edges:
1. Human genome segments
2. ES pairs
Paths in the graph are tumor genome architectures.
Tumor genome (1 -3 -4 2 5) = signed permutation of (1 2 3 4 5)
[Figure: ESP graph on segments 1 to 5, with the human genome on both axes]

Sorting permutations by reversals (Sankoff et al. 1990)
π = π_1 π_2 … π_n: signed permutation
Reversal ρ(i, j) [inversion]: π_1 … π_{i-1} -π_j … -π_i π_{j+1} … π_n
Problem: Given π, find a sequence of reversals ρ_1, …, ρ_t such that π · ρ_1 · ρ_2 ⋯ ρ_t = (1, 2, …, n) and t is minimal.
Solution: analysis of the breakpoint graph (← ESP graph)
Polynomial-time algorithms:
• O(n^4): Hannenhalli and Pevzner, 1995.
• O(n^2): Kaplan, Shamir, Tarjan, 1997.
• O(n) [distance t]: Bader, Moret, and Yan, 2001.
• O(n^3): Bergeron, 2001.

Sorting Permutations
1 -3 -4 2 5 → 1 -3 -2 4 5 → 1 2 3 4 5

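The two steps in this example can be reproduced with a short sketch of the reversal operation on a signed permutation. The function name and the 1-based inclusive indexing are choices made here to match the ρ(i, j) notation; they are not taken from any particular library.

```python
def apply_reversal(perm, i, j):
    """Reverse and negate elements i..j (1-based, inclusive) of a signed permutation."""
    middle = [-x for x in reversed(perm[i - 1:j])]
    return perm[:i - 1] + middle + perm[j:]


tumor = [1, -3, -4, 2, 5]
step1 = apply_reversal(tumor, 3, 4)   # reverse the block (-4 2)
step2 = apply_reversal(step1, 2, 3)   # reverse the block (-3 -2)
print(step1)
print(step2)
```

The first call yields 1 -3 -2 4 5 and the second yields the identity 1 2 3 4 5, so this permutation can be sorted with two reversals.
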
Breakpoint Graph
π: start 1 -3 -4 2 5 end → start 1 -3 -2 4 5 end → i: start 1 2 3 4 5 end
• Black edges: adjacent elements of π
• Gray edges: adjacent elements of i = 1 2 3 4 5
• Key parameter: black-gray cycles
Theorem: the minimum number of reversals d(π) needed to transform π to the identity permutation i satisfies d(π) ≥ n + 1 – c(π), where c(π) = number of gray-black cycles.
ESP Graph → Tumor Permutation and Breakpoint Graph

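The cycle count c(π) in the theorem can be computed with the standard doubling construction, sketched below under the usual convention (an assumption here, since the slide does not spell it out): each element x becomes the vertex pair (2x-1, 2x), reversed when x is negative, and the sequence is framed by 0 and 2n+1. Black edges join neighbours across consecutive elements of π; gray edges join 2k and 2k+1.

```python
def breakpoint_graph_cycles(perm):
    """Count alternating black-gray cycles in the breakpoint graph of a signed permutation."""
    n = len(perm)
    seq = [0]
    for x in perm:
        a = abs(x)
        seq += [2 * a - 1, 2 * a] if x > 0 else [2 * a, 2 * a - 1]
    seq.append(2 * n + 1)

    # Black edges: consecutive positions (0,1), (2,3), ... of the doubled sequence.
    black = {}
    for k in range(0, len(seq), 2):
        u, v = seq[k], seq[k + 1]
        black[u], black[v] = v, u

    cycles, seen = 0, set()
    for start in seq:
        if start in seen:
            continue
        cycles += 1
        v = start
        while True:
            w = black[v]          # follow the black edge
            seen.update((v, w))
            v = w ^ 1             # follow the gray edge: 2k <-> 2k+1
            if v == start:
                break
    return cycles


def reversal_lower_bound(perm):
    """Lower bound n + 1 - c(pi) on the reversal distance."""
    return len(perm) + 1 - breakpoint_graph_cycles(perm)


print(breakpoint_graph_cycles([1, -3, -4, 2, 5]), reversal_lower_bound([1, -3, -4, 2, 5]))
```

For 1 -3 -4 2 5 this gives 4 cycles and a lower bound of 2, which the two-reversal sorting shown on the slides attains. The sketch computes only the bound, not the additional terms of the exact Hannenhalli-Pevzner distance.
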
MCF7 Breast Cancer Cell Line
• Low-resolution chromosome painting suggests complex architecture.
• Many translocations, inversions.

ESP Data from MCF7 tumor genome
Each point (x, y) is an ES pair; axes are coordinates in the human genome.
• 6239 ES pairs (June 2003)
  – 5856 valid (black)
  – 383 invalid
    – 256 isolated (red)
    – 127 form 30 clusters (blue)

MCF7 Genome
[Figure: human chromosomes → MCF7 chromosomes]
• Sequence of 5 inversions and 15 translocations.
Raphael, Volik, Collins, Pevzner. Bioinformatics 2003.

3. Combining ESP with other genome data
Array Comparative Genomic Hybridization (aCGH)

CGH Analysis
• Divide genome into segments of equal copy number (segmentation of the copy number profile along the genome coordinate).
• Numerous methods (e.g. clustering, Hidden Markov Model, Bayesian, etc.).
No information about:
• Structural rearrangements (inversions, translocations).
• Locations of duplicated material in tumor genome.

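As a toy illustration of what "segments of equal copy number" means, the sketch below groups consecutive probes that share the same copy number call. It assumes the calls are already discrete integers; real aCGH data are noisy log ratios, and the clustering, HMM, and Bayesian methods named above exist precisely to produce such calls. The function name and the half-open index convention are choices made here.

```python
from itertools import groupby


def segment(copy_numbers):
    """Return (start, end, copy_number) runs, with end exclusive, over probe indices."""
    segments, start = [], 0
    for value, run in groupby(copy_numbers):
        length = len(list(run))
        segments.append((start, start + length, value))
        start += length
    return segments


# Hypothetical per-probe calls with copy number levels 2, 5, and 3.
print(segment([2, 2, 5, 5, 5, 3, 3]))
```

The boundaries between consecutive segments are the CGH breakpoints that are later compared with ESP breakpoints.
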
CGH Segmentation
• How are the copies of segments linked?
• ES pairs link segments.
[Figure: copy number profile (levels 5, 3, 2) along the genome coordinate, and the unknown tumor genome]

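ESP + CGH
• ES near segment boundaries: an ESP breakpoint lies close to a CGH breakpoint.
[Figure: copy number profile (levels 5, 3, 2) along the genome coordinate, with a CGH breakpoint and an ESP breakpoint marked]
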
ESP and CGH Breakpoints
[Figure: Venn diagrams of ESP breakpoints and CGH breakpoints for each cell line]
• MCF7: 730 and 256 breakpoints, 39 shared (P = 1.2 × 10^-4); 12/39 clusters
• BT474: 426 and 244 breakpoints, 33 shared (P = 5.4 × 10^-7); 8/33 clusters

Microdeletion in BT474
• ES pair spanning ≈ 600 kb; a valid ES pair spans < 250 kb.
• “Interesting” genes in this region.
[Figure: copy number profile (levels 3, 2, 0) with the ES pair spanning the deleted region]

Combining ESP and CGH
• ES pairs link segments.
• Copy number balance at each segment boundary: 5 = 2 + 3.
[Figure: copy number profile (levels 5, 3, 2) along the genome coordinate]

Combining ESP and CGH
• CGH copy number is not exact.
• What genome architecture is “most consistent” with ESP and CGH data?
[Figure: copy number profile (levels 5, 3, 2) along the genome coordinate; CGH edges annotated with ranges 3 ≤ f(e) ≤ 5, 1 ≤ f(e) ≤ 4, 1 ≤ f(e) ≤ 3]

Combining ESP and CGH
Build graph:
1. Edge for each CGH segment.
2. Edge for each ES pair consistent with segments.
3. Range of copy number values for each CGH edge.
[Figure: copy number profile (levels 5, 3, 2) along the genome coordinate; CGH edges annotated with ranges 3 ≤ f(e) ≤ 5, 1 ≤ f(e) ≤ 3, 1 ≤ f(e) ≤ 4]

Network Flow Problem
• Minimum Cost Circulation with Capacity Constraints (Sequencing by Hybridization, Sequence Assembly)
• Flow f(e) on each edge e, with flow constraints l(e) ≤ f(e) ≤ u(e):
  – CGH edge: l(e) and u(e) from CGH
  – ESP edge: l(e) = 1, u(e) = ∞
• Flow in = flow out at each vertex.
Formulation:
  minimize Σ_e α(e) f(e)
  subject to l(e) ≤ f(e) ≤ u(e) ∀ e
  Σ_{(u,v)} f((u, v)) = Σ_{(v,w)} f((v, w)) ∀ v
Costs: α(e) = 0 if e is an ESP or CGH edge; α(e) = 1 if e is incident to the source/sink.
[Figure: flow network with a source/sink vertex]

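The constraints of this circulation problem can be made concrete with a small feasibility checker. This is not a solver: it only verifies that a proposed flow respects the bounds and conserves flow at every vertex, and it totals the cost. The graph, vertex names, bounds, and flow values are hypothetical, loosely modelled on the 5 = 2 + 3 copy number example; treating the ESP upper bound as unbounded and charging unit cost per unit of source/sink flow are assumptions of this sketch.

```python
import math
from collections import defaultdict


def check_circulation(edges, flow):
    """Check bounds and conservation; return (feasible, cost or None).

    edges: {name: (tail, head, lower, upper, unit_cost)}
    flow:  {name: integer flow value}
    """
    balance = defaultdict(int)
    cost = 0
    for name, (tail, head, lower, upper, unit_cost) in edges.items():
        f = flow[name]
        if not lower <= f <= upper:
            return False, None
        balance[tail] -= f
        balance[head] += f
        cost += unit_cost * f
    feasible = all(b == 0 for b in balance.values())
    return feasible, (cost if feasible else None)


# Hypothetical network: three CGH segments, one ESP link, one source/sink vertex S.
edges = {
    "cgh_5": ("v0", "v1", 3, 5, 0),          # CGH segment, copy number about 5
    "cgh_3": ("v1", "v2", 1, 4, 0),          # CGH segment, copy number about 3
    "esp":   ("v1", "v3", 1, math.inf, 0),   # ES pair link (upper bound assumed unbounded)
    "cgh_2": ("v3", "v4", 1, 3, 0),          # CGH segment, copy number about 2
    "s_in":  ("S", "v0", 0, math.inf, 1),    # edges incident to source/sink cost 1
    "s_out1": ("v2", "S", 0, math.inf, 1),
    "s_out2": ("v4", "S", 0, math.inf, 1),
}
flow = {"cgh_5": 5, "cgh_3": 3, "esp": 2, "cgh_2": 2,
        "s_in": 5, "s_out1": 3, "s_out2": 2}
result = check_circulation(edges, flow)
print(result)
```

At vertex v1 the incoming flow of 5 splits into 3 along the next CGH segment and 2 along the ES pair, which is the copy number balance 5 = 2 + 3. Finding the minimum-cost flow itself would require a min-cost circulation algorithm, which this sketch does not attempt.
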
Network Flow Results
• Unsatisfied flow f(e) through the source/sink marks putative locations of missing ESP data.
• Prioritize further sequencing.
• Targeted ESP by screening the library with CGH probes.

Network Flow Results
• Identify amplified translocations
  – 14 in MCF7
  – 5 in BT474
• Eulerian cycle in combined graph gives tumor genome architecture.
• Flow values → edge multiplicities

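To illustrate how edge multiplicities lead to an architecture, here is a sketch of Hierholzer's algorithm on a directed multigraph. It is a simplification: it assumes a connected graph in which every vertex has equal in-degree and out-degree, and it treats edges as directed, whereas genome segments can be traversed in either orientation. The vertex labels and multiplicities are hypothetical.

```python
def eulerian_cycle(multiplicity, start):
    """Return a closed walk using each edge (u, v) exactly multiplicity[(u, v)] times.

    Assumes the multigraph is connected and balanced (in-degree = out-degree).
    """
    out = {}
    for (u, v), m in multiplicity.items():
        out.setdefault(u, []).extend([v] * m)
        out.setdefault(v, [])

    stack, cycle = [start], []
    while stack:
        u = stack[-1]
        if out[u]:
            stack.append(out[u].pop())   # extend the current trail
        else:
            cycle.append(stack.pop())    # dead end: emit vertex
    return cycle[::-1]


# Hypothetical flow values used as edge multiplicities.
multiplicity = {("A", "B"): 2, ("B", "A"): 1, ("B", "C"): 1, ("C", "A"): 1}
walk = eulerian_cycle(multiplicity, "A")
print(walk)
```

An edge with multiplicity 2 is traversed twice in the walk, which corresponds to a segment or link that is present in two copies in the reconstructed genome. Eulerian cycles are generally not unique, so such a walk is one architecture consistent with the data rather than the only one.
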
Human Cancer Genome Project etc.
• What tumors to sequence?
• What to sequence from each tumor?
  1. Whole genome: all alterations
  2. Specific genes: point mutations
  3. Hybrid approach: structural rearrangements