3. Genome analysis
The first DNA-based genome to be sequenced in its entirety was that of bacteriophage ΦX174 (5,386 bp), sequenced by Frederick Sanger and colleagues in 1977.
There are several things to notice in this plot. First, the genome is circular. The densities of the four nucleotides are plotted in the four outermost circles. These densities are not evenly distributed: although all four scales range from 0% (minimum, no colour) to 40% (maximum colour intensity), the sequence is clearly dominated by T's (red circle), with relatively few G's (outermost turquoise circle) and C's (pink circle), and a few A-rich regions (green second circle). Many genes overlap; the genes are shown in the "annotation circle", the fifth circle from the outside, with blue bands representing genes in the forward direction.

GC skew = (G - C)/(G + C)
AT skew = (A - T)/(A + T)

http://users.rcn.com/jkimball.ma.ultranet/BiologyPages/P/PhiX.html
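The nucleotide densities and the two skew formulas can be computed directly from base counts. The following is a minimal sketch in Python, assuming an uppercase DNA string; the function names and the toy sequence are illustrative and do not come from the slides.

```python
from collections import Counter

def base_fractions(seq):
    """Fraction of A, C, G and T in seq (other symbols are ignored)."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        return {b: 0.0 for b in "ACGT"}
    return {b: counts[b] / total for b in "ACGT"}

def skew(seq, x, y):
    """Generic skew (x - y) / (x + y) for two bases x and y."""
    counts = Counter(seq.upper())
    denom = counts[x] + counts[y]
    return (counts[x] - counts[y]) / denom if denom else 0.0

def gc_skew(seq):
    return skew(seq, "G", "C")

def at_skew(seq):
    return skew(seq, "A", "T")

if __name__ == "__main__":
    toy = "GGGCAATT"  # hypothetical toy sequence, not phage data
    print(base_fractions(toy))
    print(gc_skew(toy), at_skew(toy))
```

To reproduce circles like those in the plot, the same functions would be applied to successive windows along the genome rather than to the whole sequence at once.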
Exploring genome (and protein) sequences: one language, one basic tool
Exploring genomes
Broad Institute of MIT and Harvard
Department of Energy Joint Genome Institute (Berkeley, USA)
Institute for Genome Sciences (University of Maryland, Baltimore, USA)
In whole-genome shotgun sequencing (top), the entire genome is sheared randomly into small fragments (appropriately sized for sequencing) and then reassembled. In hierarchical shotgun sequencing (bottom), the genome is first broken into larger segments. After the order of these segments is deduced, they are further sheared into fragments appropriately sized for sequencing.
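The shear-then-reassemble idea can be illustrated with a toy example. The sketch below is not a real assembler: it assumes error-free reads, cuts the genome into regular overlapping tiles as a deterministic stand-in for random shearing, and merges reads greedily by their longest suffix/prefix overlap. The genome string, the function names and the parameters are all hypothetical.

```python
def shear(genome, read_len, step):
    """Cut genome into overlapping reads (regular tiling, for illustration)."""
    return [genome[i:i + read_len]
            for i in range(0, len(genome) - read_len + 1, step)]

def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len):
    """Repeatedly merge the pair of contigs with the largest overlap."""
    contigs = list(reads)
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs)
                   if n not in (best_i, best_j)] + [merged]

if __name__ == "__main__":
    genome = "ATGCGTACCTGAAGTCCATG"  # toy sequence with no repeated 4-mers
    reads = shear(genome, read_len=8, step=4)
    scrambled = [reads[2], reads[0], reads[3], reads[1]]  # original order lost
    print(greedy_assemble(scrambled, min_len=4))
```

The toy genome was chosen so that every overlap of four or more bases is a genuine one. Real genomes contain repeats that mislead a greedy merge, which is one motivation for the hierarchical strategy: ordering large segments first confines each assembly to a smaller, less repetitive region.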
Circular maps of the chromosome and plasmids of enteropathogenic E. coli (EPEC) strain E2348/69 (from Iguchi A et al., J. Bacteriol. 2009)

CDS = CoDing Sequence: the region of nucleotides that corresponds to the sequence of amino acids in the predicted protein
PP = prophage: a phage (viral) genome inserted and integrated into the circular bacterial DNA chromosome
IE = integrative element

(A) EPEC strain E2348/69 chromosome. From the outside in, the first circle shows the locations of PPs and IEs (purple, lambda-like PPs; light blue, other PPs; green, IEs and the LEE element), the second circle shows the nucleotide sequence positions (in Mbp), the third and fourth circles show CDSs transcribed clockwise and anticlockwise, respectively (gray, conserved in all eight other sequenced E. coli strains; red, conserved only in the B2 phylogroup; yellow, variable distribution; blue, E2348/69 specific), the fifth circle shows the tRNA genes (red), the sixth circle shows the rRNA operons (blue), the seventh circle shows the G+C content, and the eighth circle shows the GC skew.

(B) EPEC strain E2348/69 plasmids. The boxes in the outer and inner circles represent CDSs transcribed clockwise and anticlockwise, respectively. Pseudogenes are indicated by black boxes, and other CDSs are indicated by the colors described above for panel A.
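The CDS definition, and the distinction between CDSs transcribed clockwise and anticlockwise, can be made concrete with a short translation routine. This is a minimal sketch assuming the standard genetic code and a CDS that starts in frame; the function names, the "+"/"-" strand convention and the example sequence are illustrative and are not taken from the paper.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate_cds(seq, strand="+"):
    """Translate a CDS into one-letter amino acids, stopping at a stop codon.

    strand "-" means the CDS lies on the opposite strand, so the given
    sequence is reverse-complemented before translation.
    """
    seq = seq.upper()
    if strand == "-":
        seq = reverse_complement(seq)
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks an unknown codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

if __name__ == "__main__":
    cds = "ATGGCCTGTTAA"  # hypothetical CDS: Met-Ala-Cys-stop
    print(translate_cds(cds))                             # forward strand
    print(translate_cds(reverse_complement(cds), "-"))    # same gene, other strand
```

Both calls describe the same gene: the second passes the sequence as it would appear on the map's reference strand when the CDS is encoded on the opposite one.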
Structural genomics
[Figure: pipeline from DNA through experimental methods (X-ray diffractometry, NMR, cryo-electron tomography) and raw binary data to an algorithm that outputs per-residue values and the protein structure.]
Per-residue values shown in the figure (columns unlabelled in the source):
Residue
THR    0.0   147.7   172.9
THR  107.2  -125.3   187.4
CYS  123.4    63.6   103.7
PRO   60.3    83.9  -116.7