- Slides: 28
Dot Plots
DNA dot plots Identification of regions of – Similarity between two sequences – Insertions-deletions: Introns – Repetitive regions (self-self analysis) – Inverted repeats
Repeats • All DNA sequences contain repeats
Window size • Window size 1
Window size • Window size 9
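A rough way to see why the window size matters is to count how many cells get a dot for two unrelated sequences. The sketch below is only an illustration: the random sequences, the fixed seed, the helper name `dot_fraction`, and the rule "dot only if every position in the window is identical" are assumptions, not taken from the slides.

```python
import random

def dot_fraction(seq1, seq2, window):
    """Fraction of cells whose windows are identical at every position."""
    n1 = len(seq1) - window + 1
    n2 = len(seq2) - window + 1
    dots = sum(
        seq1[i:i + window] == seq2[j:j + window]
        for i in range(n1)
        for j in range(n2)
    )
    return dots / (n1 * n2)

rng = random.Random(0)  # fixed seed so the run is repeatable
a = "".join(rng.choice("ACGT") for _ in range(200))
b = "".join(rng.choice("ACGT") for _ in range(200))

frac1 = dot_fraction(a, b, 1)  # window size 1: many chance matches
frac9 = dot_fraction(a, b, 9)  # window size 9: chance matches become rare
print(frac1, frac9)
```

With window size 1, roughly a quarter of all cells match by chance in random DNA, which hides real diagonals; a longer window suppresses that background.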
Exercise Practice with a) window size 1 b) window size 3. Both axes carry the same sequence, GGAAATCC: Sequence 1 runs left to right along the horizontal axis and Sequence 2 runs bottom to top along the vertical axis.
Exercise Window size 1: a cell is marked when the two bases are identical.
Exercise Window size 3: each cell scores the 3-base window that starts at that position, so the last two positions (C, C) of each sequence start no complete window and are not considered.
Exercise Window size 3: GGA (Sequence 1) against GGA (Sequence 2) = 3 / 3 identities
Exercise Window size 3: GGA (Sequence 1) against GAA (Sequence 2) = 2 / 3 identities
Exercise Window size 3: GGA (Sequence 1) against AAA (Sequence 2) = 1 / 3 identities
Exercise Window size 3: GGA (Sequence 1) against AAT (Sequence 2) = 0 / 3 identities
Exercise Window size 3: completed plot (each number is the count of identities in the window)
    Sequence 2
    C   (not considered)
    C   (not considered)
    T   0 0 0 0 1 3
    A   0 0 1 1 3 1
    A   0 1 2 3 1 0
    A   1 2 3 2 1 0
    G   2 3 2 1 0 0
    G   3 2 1 0 0 0
        G G A A A T C C   Sequence 1
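The window-size-3 exercise can be checked with a few lines of Python. This is a minimal sketch, assuming both axes carry the sequence GGAAATCC as read from the slide; the function name `window_scores` is made up for illustration.

```python
def window_scores(seq1, seq2, window):
    """Return scores[j][i]: the number of identities between the window
    starting at position i of seq1 and the window starting at position j
    of seq2. Positions that cannot start a full window are skipped."""
    n1 = len(seq1) - window + 1
    n2 = len(seq2) - window + 1
    return [
        [
            sum(a == b for a, b in zip(seq1[i:i + window], seq2[j:j + window]))
            for i in range(n1)
        ]
        for j in range(n2)
    ]

seq = "GGAAATCC"
scores = window_scores(seq, seq, 3)

# Print with Sequence 2 running bottom to top, as on the slide.
for row in reversed(scores):
    print(" ".join(str(s) for s in row))
```

The main diagonal holds the 3s because the sequence is compared with itself; the lower scores on either side are partial matches between overlapping windows.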
Introns • Introns are spliced out in the mRNA (figure labels: Gene, mRNA)
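In a gene-versus-mRNA dot plot, a spliced-out intron shows up as a jump from one diagonal to another. The sketch below illustrates this by counting exact window matches per diagonal. The exon and intron strings are invented toy sequences, and `diagonal_offsets` and the window size of 6 are hypothetical choices for illustration.

```python
from collections import Counter

def diagonal_offsets(gene, mrna, window):
    """Count exact window matches per diagonal, where the diagonal is
    identified by (start in gene) - (start in mRNA)."""
    offsets = Counter()
    for i in range(len(gene) - window + 1):
        for j in range(len(mrna) - window + 1):
            if gene[i:i + window] == mrna[j:j + window]:
                offsets[i - j] += 1
    return offsets

# Toy sequences (made up): two exons separated by one intron.
exon1 = "ATGGCCATTGTA"
intron = "GTAAGTCTCTCTCAG"
exon2 = "CGTGAAGCTTGA"

gene = exon1 + intron + exon2
mrna = exon1 + exon2  # the intron is spliced out

offsets = diagonal_offsets(gene, mrna, 6)
for offset, count in sorted(offsets.items()):
    print(f"diagonal offset {offset}: {count} matching windows")
```

Exon 1 lies on the diagonal with offset 0, and exon 2 lies on a diagonal shifted by the intron length (15 bases in this toy case).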
Protein dot plots
CLC Combined Workbench
Ankyrin repeat protein
HIV Long Terminal Repeats
Di-nucleotide repeats
Repetitive regions
Exercise: Inverted repeats
Exercise: Inverted repeats Window size 3. Make a dot plot with the sequence against the reverse complement of the sequence. Now diagonals represent inverted repeats. Sequence 1 (horizontal axis, left to right): GGAAATCC. Reverse complement (vertical axis, bottom to top): GGATTTCC.
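The inverted-repeat exercise can be sketched as follows, assuming Sequence 1 is GGAAATCC as read from the slide axis. The helper names `reverse_complement` and `exact_hits` are illustrative, and only full 3 / 3 matches are reported.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complement every base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]

def exact_hits(seq1, seq2, window):
    """Return (i, j) start positions where the windows are fully identical."""
    return [
        (i, j)
        for i in range(len(seq1) - window + 1)
        for j in range(len(seq2) - window + 1)
        if seq1[i:i + window] == seq2[j:j + window]
    ]

seq = "GGAAATCC"
rc = reverse_complement(seq)
hits = exact_hits(seq, rc, 3)
print(rc)
print(hits)
```

The two hits correspond to GGA at the start and TCC at the end of the sequence: TCC is the reverse complement of GGA, so the two ends form an inverted repeat.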
Genome dot plots: inverted repeats Analysis of a random sequence of Homo sapiens chromosome 7 reveals numerous short inverted repeats
The human Alu sequence A self-self plot reveals some repetitive regions.
The human Alu sequence A plot of the Alu sequence against its reverse complement reveals its inverted repeat (palindromic) nature, seen as the diagonal along the entire sequence length.
WD-repeat proteins • Identity matrix • BLOSUM45 matrix
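The difference between identity scoring and substitution-matrix scoring can be sketched with a toy example. The residue groups and scores below are hypothetical stand-ins chosen for illustration; they are NOT BLOSUM45 values, and the function names are made up.

```python
# Toy residue groups: hypothetical, for illustration only (not BLOSUM45).
GROUPS = [set("ILVM"), set("DE"), set("KR"), set("ST"), set("FYW")]

def identity_score(a, b):
    """Score 1 for identical residues, 0 otherwise."""
    return 1 if a == b else 0

def toy_similarity_score(a, b):
    """Score 2 for identical residues, 1 for residues in the same toy
    group, 0 otherwise."""
    if a == b:
        return 2
    if any(a in group and b in group for group in GROUPS):
        return 1
    return 0

def window_score(seq1, seq2, i, j, window, score):
    """Sum the pairwise scores over one window pair."""
    return sum(
        score(a, b)
        for a, b in zip(seq1[i:i + window], seq2[j:j + window])
    )

identity_total = window_score("LKDS", "IRET", 0, 0, 4, identity_score)
similarity_total = window_score("LKDS", "IRET", 0, 0, 4, toy_similarity_score)
print(identity_total, similarity_total)
```

The two peptides share no identical residue, so identity scoring sees nothing, while similarity scoring rewards the conservative substitutions. This is the kind of signal that lets a substitution matrix expose diverged protein repeats.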
Conclusion • Dot plots provide an intuitive view of sequence comparisons. • The sliding window size is important. • For proteins, substitution matrices can be used. • Dot plots can reveal – Repeats – Insertions/deletions (such as introns) – Inverted repeats