Dot Plots

DNA dot plots
Identification of regions of
– Similarity between two sequences
– Insertions/deletions: introns
– Repetitive regions (self-self analysis)
– Inverted repeats

Repeats
• All DNA sequences contain repeats

Window size
• Window size 1

Window size
• Window size 9

Exercise
Practice for a) window size 1 and b) window size 3.
Sequence 1 (horizontal axis): G G A A A T C C
Sequence 2 (vertical axis, bottom to top): G G A A A T C C

Exercise: Window size 1
[Figure: dot plot of Sequence 1 (horizontal) against Sequence 2 (vertical); each dot marks an identity between single bases.]
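
The window-size-1 exercise can be sketched in a few lines of Python. This is only an illustration: the function names `dot_plot` and `render` are made up for this note, and the text layout (Sequence 2 running bottom to top, `*` for an identity) is an assumption about how the slide was drawn.

```python
def dot_plot(seq1, seq2):
    """Return a matrix with True wherever seq2[row] equals seq1[col]."""
    return [[a == b for a in seq1] for b in seq2]


def render(seq1, seq2):
    """Draw the plot as text, with seq2 running bottom to top."""
    matrix = dot_plot(seq1, seq2)
    lines = []
    for row in reversed(range(len(seq2))):
        cells = " ".join("*" if hit else "." for hit in matrix[row])
        lines.append(f"{seq2[row]} {cells}")
    # Horizontal axis: the bases of seq1, aligned under the columns.
    lines.append("  " + " ".join(seq1))
    return "\n".join(lines)


print(render("GGAAATCC", "GGAAATCC"))
```

With a window of 1 every shared base produces a dot, so the main diagonal is surrounded by many off-diagonal dots (for example, each A matches all three As).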

Exercise: Window size 3
[Figure: the same axes; the last two positions (C C) of Sequence 2 are marked "Not considered", since no full three-base window starts there.]

Exercise: Window size 3
GGA (Sequence 1) against GGA (Sequence 2) = 3/3 identities

Exercise: Window size 3
GGA (Sequence 1) against GAA (Sequence 2) = 2/3 identities

Exercise: Window size 3
GGA (Sequence 1) against AAA (Sequence 2) = 1/3 identities

Exercise: Window size 3
GGA (Sequence 1) against AAT (Sequence 2) = 0/3 identities

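Exercise: Window size 3
Number of identities for each pair of windows. Each row and column is labelled with the base at which its window starts; the last two positions (C C) of each sequence start no full window.

Sequence 2
    T | 0 0 0 0 1 3
    A | 0 0 1 1 3 1
    A | 0 1 2 3 1 0
    A | 1 2 3 2 1 0
    G | 2 3 2 1 0 0
    G | 3 2 1 0 0 0
      +------------
        G G A A A T   Sequence 1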
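
The window-size-3 scores can be reproduced with a short sliding-window sketch. The names `window_scores` and `dots` are hypothetical, and the idea of drawing a dot only where the score reaches a threshold is an assumption: the slides show the raw counts but do not state a cutoff.

```python
def window_scores(seq1, seq2, window):
    """scores[j][i] = identities between seq1[i:i+window] and seq2[j:j+window]."""
    n1 = len(seq1) - window + 1  # number of full windows in seq1
    n2 = len(seq2) - window + 1  # number of full windows in seq2
    return [
        [sum(a == b for a, b in zip(seq1[i:i + window], seq2[j:j + window]))
         for i in range(n1)]
        for j in range(n2)
    ]


def dots(scores, threshold):
    """Assumed plotting rule: keep a dot where the window score >= threshold."""
    return [[s >= threshold for s in row] for row in scores]


scores = window_scores("GGAAATCC", "GGAAATCC", 3)
# Print with Sequence 2 running bottom to top, as in the exercise.
for j in reversed(range(len(scores))):
    print(" ".join(str(s) for s in scores[j]))
```

Requiring 3/3 identities leaves only the six dots on the main diagonal, whereas a threshold of 2/3 keeps twelve. This is the trade-off behind the choice of window size and stringency.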

Introns
Introns are spliced out of the transcript, so they are absent from the mRNA.
[Figure: a gene with its introns bracketed, and the resulting mRNA.]
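
A toy sketch of how an intron shows up when a gene is plotted against its mRNA. The sequences below are invented for illustration only, `window_hits` is a hypothetical helper, and exact window matches are used in place of a score threshold to keep the example short.

```python
def window_hits(seq1, seq2, window):
    """Return (i, j) pairs where seq1[i:i+window] exactly equals seq2[j:j+window]."""
    hits = []
    for i in range(len(seq1) - window + 1):
        for j in range(len(seq2) - window + 1):
            if seq1[i:i + window] == seq2[j:j + window]:
                hits.append((i, j))
    return hits


# Hypothetical toy sequences, not taken from any real gene.
exon1, intron, exon2 = "ATGGCCAAG", "GTAAGTTTTCAG", "CTGACCTGA"
gene = exon1 + intron + exon2
mrna = exon1 + exon2

hits = window_hits(gene, mrna, window=6)
# Each diagonal of the dot plot corresponds to one value of i - j.
offsets = sorted({i - j for i, j in hits})
print(offsets)
```

For these toy sequences the hits fall on two diagonals, at offsets 0 and 12. The jump between them equals the intron length, which is how the gap in a gene-versus-mRNA dot plot can be read.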

Protein dot plots

CLC Combined Workbench

Ankyrin repeat protein

HIV Long Terminal Repeats

Di-nucleotide repeats

Repetitive regions

Exercise: Inverted repeats

Exercise: Inverted repeats
Window size 3
Make a dot plot of the sequence against its reverse complement. Diagonals now represent inverted repeats.
Sequence 1 (horizontal axis): G G A A A T C C
Reverse complement (vertical axis, bottom to top): G G A T T T C C
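
The inverted-repeat exercise can be sketched as follows. The helper names are hypothetical, and the scoring simply reuses the count of identities per window from the earlier exercise.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Complement each base and read the result in reverse order."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def window_scores(seq1, seq2, window):
    """scores[j][i] = identities between seq1[i:i+window] and seq2[j:j+window]."""
    n1 = len(seq1) - window + 1
    n2 = len(seq2) - window + 1
    return [
        [sum(a == b for a, b in zip(seq1[i:i + window], seq2[j:j + window]))
         for i in range(n1)]
        for j in range(n2)
    ]


seq = "GGAAATCC"
rc = reverse_complement(seq)
scores = window_scores(seq, rc, 3)
diagonal = [scores[k][k] for k in range(len(scores))]
print(rc)
print(diagonal)
```

The two ends of the main diagonal score 3/3 because GGA at the start of the sequence is the reverse complement of TCC at its end, a small inverted repeat.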

Genome dot plots: inverted repeats
Analysis of a random sequence of Homo sapiens chromosome 7 reveals numerous short inverted repeats.

The human Alu sequence
A self-self plot reveals some repetitive regions.

The human Alu sequence
A plot of the Alu sequence against its reverse complement reveals its inverted-repeat (palindromic) nature, seen as the diagonal along the entire sequence length.

WD-repeat proteins
[Figure: two dot plots, one scored with an identity matrix and one with the BLOSUM45 matrix.]
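
A rough sketch of why the scoring matrix matters for protein dot plots. The toy scores below are invented for illustration and are NOT real BLOSUM45 values; the function names and the two short peptides are likewise hypothetical.

```python
def identity_score(a, b):
    """Identity matrix: 1 for identical residues, 0 otherwise."""
    return 1 if a == b else 0


# Hypothetical pairs of chemically similar residues (illustrative only).
TOY_SIMILAR = {frozenset("IL"), frozenset("IV"), frozenset("LV"),
               frozenset("DE"), frozenset("KR")}


def toy_similarity_score(a, b):
    """Toy substitution scores: 2 identical, 1 similar, -1 otherwise."""
    if a == b:
        return 2
    if frozenset((a, b)) in TOY_SIMILAR:
        return 1
    return -1


def window_scores(seq1, seq2, window, score):
    """scores[j][i] = summed pair scores over the windows starting at i and j."""
    n1 = len(seq1) - window + 1
    n2 = len(seq2) - window + 1
    return [
        [sum(score(a, b) for a, b in zip(seq1[i:i + window], seq2[j:j + window]))
         for i in range(n1)]
        for j in range(n2)
    ]


pep1, pep2 = "LKDI", "IRDV"  # made-up peptides with conservative changes
by_identity = window_scores(pep1, pep2, 4, identity_score)
by_similarity = window_scores(pep1, pep2, 4, toy_similarity_score)
print(by_identity, by_similarity)
```

Under identity scoring the two peptides share only one residue, while the toy similarity scores reward the conservative substitutions as well. A real substitution matrix such as BLOSUM45 can bring out diverged repeats in the same way.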

Conclusion
• Dot plots provide an intuitive view of sequence comparisons.
• The sliding window size is important.
• For proteins, substitution matrices can be used.
• Dot plots can reveal
  – Repeats
  – Insertions/deletions (such as introns)
  – Inverted repeats