Sequence Alignment
- Global alignment: compares two sequences in their entirety. The gap penalty is assessed regardless of whether gaps are located internally within a sequence or at the end of one or both sequences. Algorithm: Needleman and Wunsch.
- Local alignment: finds the best-matching subsequences within the two sequences. Algorithm: Smith-Waterman.
- Semi-global alignment: terminal (end) gaps are treated differently from internal gaps, typically by not penalizing them. Terminal gaps are usually the result of incomplete data and have no biological significance. Example: searching for the best alignment between a short sequence and an entire genome. Algorithm: a modification of Needleman and Wunsch.
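One way the semi-global modification could look is sketched below for the short-sequence-versus-genome case. It assumes that terminal gaps are free only on the long sequence's side, that scoring is a simple match/mismatch rule with a linear gap penalty, and that only the score and end position are needed. The function name `semi_global_score` and its parameters are hypothetical, chosen for illustration.

```python
def semi_global_score(read, ref, match=1, mismatch=0, gap=-1):
    """Best score for aligning all of `read` against any stretch of `ref`.

    Assumption: leading and trailing stretches of `ref` that hang over
    the ends of `read` are terminal gaps and cost nothing.
    """
    m, n = len(read), len(ref)
    F = [[0] * (n + 1) for _ in range(m + 1)]

    # Row 0 stays at 0: skipping a prefix of `ref` is free.
    # Column 0 is penalized: `read` characters aligned to gaps still cost GP.
    for i in range(1, m + 1):
        F[i][0] = i * gap

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if read[i - 1] == ref[j - 1] else mismatch
            F[i][j] = max(
                F[i - 1][j - 1] + s,  # read[i-1] aligned to ref[j-1]
                F[i - 1][j] + gap,    # read[i-1] aligned to a gap
                F[i][j - 1] + gap,    # ref[j-1] aligned to a gap
            )

    # Free trailing gap: the alignment may end at any column of the last row.
    end = max(range(n + 1), key=lambda j: F[m][j])
    return F[m][end], end


if __name__ == "__main__":
    print(semi_global_score("ACG", "TTACGTT"))
```

The only differences from the global algorithm are the zero-initialized first row and the maximum taken over the last row instead of reading the bottom-right cell.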
Algorithm Design Techniques
- Exhaustive search (brute force): examines every possible alternative to find one particular solution.
- Dynamic programming: breaks the problem into smaller sub-problems and uses the solutions of the sub-problems to construct the solution of the larger problem.
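To get a rough feel for the difference between the two techniques, the sketch below counts how many global alignments an exhaustive search would have to examine. It assumes an alignment is counted as a path through the (M+1) x (N+1) grid using diagonal, vertical, and horizontal steps, which gives the Delannoy numbers. The function name `count_alignments` is hypothetical. The count itself is computed by dynamic programming, reusing the three smaller sub-problems for each cell.

```python
def count_alignments(m, n):
    """Number of grid paths from (0, 0) to (m, n) using diagonal, down, and right steps."""
    # D[i][0] = D[0][j] = 1: only one way to align a sequence against nothing.
    D = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Last column of the alignment is a pair, a gap in Y, or a gap in X.
            D[i][j] = D[i - 1][j - 1] + D[i - 1][j] + D[i][j - 1]
    return D[m][n]


if __name__ == "__main__":
    m, n = 4, 3
    print("alignments to enumerate:", count_alignments(m, n))
    print("table cells to fill:", (m + 1) * (n + 1))
```

For sequences of length 4 and 3 this gives 129 candidate alignments against 20 table cells, and the gap widens quickly as the sequences grow.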
Needleman and Wunsch Algorithm
- Input: two strings X = x_1…x_M and Y = y_1…y_N, and scoring rules: scoring matrix s and gap penalty GP
- Output: an alignment of X and Y whose score, as defined by the scoring rules, is maximal among all possible alignments of X and Y
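- Initialization: F(0, 0) = 0, F(i, 0) = i · GP, F(0, j) = j · GP
- Iteration: F(i, j) = max{ F(i-1, j-1) + s(x_i, y_j), F(i-1, j) + GP, F(i, j-1) + GP }, where F(i, j) is the best score for aligning x_1…x_i with y_1…y_j
- Termination: F(M, N) is the optimal score
- Finding the optimal alignment:
  - Every non-decreasing path from (0, 0) to (M, N) corresponds to a global alignment of the two sequences.
  - Use TraceBackP, starting at (M, N), to trace back an optimal alignment. Each step is one of three cases:
    - case 1: x_i aligns to y_j
    - case 2: x_i aligns to a gap
    - case 3: y_j aligns to a gap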
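The fill and traceback steps can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes a simple match/mismatch rule in place of a full scoring matrix s, a linear gap penalty, and it recomputes the traceback choice from the scores instead of storing pointers. The function name `needleman_wunsch` and its parameters are hypothetical.

```python
def needleman_wunsch(x, y, match=1, mismatch=0, gap=-1):
    """Return (score, aligned_x, aligned_y, F) for one optimal global alignment."""
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]

    # Initialization: a prefix aligned against nothing costs one gap per character.
    for i in range(1, m + 1):
        F[i][0] = i * gap
    for j in range(1, n + 1):
        F[0][j] = j * gap

    # Iteration: best of the three cases for each cell.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s, F[i - 1][j] + gap, F[i][j - 1] + gap)

    # Traceback from (M, N) to (0, 0), preferring the diagonal when scores tie.
    ax, ay = [], []
    i, j = m, n
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and x[i - 1] == y[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s:
            ax.append(x[i - 1]); ay.append(y[j - 1])   # case 1: x_i aligns to y_j
            i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            ax.append(x[i - 1]); ay.append("-")        # case 2: x_i aligns to a gap
            i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1])        # case 3: y_j aligns to a gap
            j -= 1

    return F[m][n], "".join(reversed(ax)), "".join(reversed(ay)), F


if __name__ == "__main__":
    score, ax, ay, F = needleman_wunsch("AACT", "ACG")
    print(score)
    print(ax)
    print(ay)
```

When several cells tie, the tie-breaking order decides which of the equally good alignments is reported, so a different order may return a different alignment with the same score.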
Global Alignment Example
- Find the optimal global alignment of AACT and ACG.
- Scoring rules: match = 1, mismatch = 0, gap penalty GP = -1

Score matrix F (rows: AACT, columns: ACG):

           A   C   G
       0  -1  -2  -3
   A  -1   1   0  -1
   A  -2   0   1   0
   C  -3  -1   1   1
   T  -4  -2   0   1

Optimal alignments:

Alignment 1, score = 1
   AACT
    ||
   -ACG

Alignment 2, score = 1
   AACT
   | |
   A-CG
Smith-Waterman Algorithm
- Input: strings X and Y, and scoring rules: scoring matrix s and gap penalty GP
- Output: substrings of X and Y whose global alignment score, as defined by the scoring rules, is maximal among all global alignments of all substrings of X and Y
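- Initialization: F(i, 0) = 0 and F(0, j) = 0
- Iteration: F(i, j) = max{ 0, F(i-1, j-1) + s(x_i, y_j), F(i-1, j) + GP, F(i, j-1) + GP }
- The largest value of F(i, j) is the score of the best local alignment of X and Y.
- Traceback begins at the highest score in the matrix and continues until a cell with score 0 is reached.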
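A minimal sketch of the local variant follows, under the same assumptions as a basic global implementation: a match/mismatch rule stands in for the scoring matrix s, and the gap penalty is linear. The function name `smith_waterman` is hypothetical. The changes from the global algorithm are the floor at 0, the search for the highest-scoring cell, and the stop condition of the traceback.

```python
def smith_waterman(x, y, match=1, mismatch=0, gap=-1):
    """Return (score, aligned_x, aligned_y, F) for one optimal local alignment."""
    m, n = len(x), len(y)
    # First row and column stay at 0: a local alignment may start anywhere.
    F = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            F[i][j] = max(0, F[i - 1][j - 1] + s, F[i - 1][j] + gap, F[i][j - 1] + gap)
            if F[i][j] > best:  # strict '>' keeps the first highest cell found
                best, best_pos = F[i][j], (i, j)

    # Traceback from the highest-scoring cell until a 0 is reached.
    ax, ay = [], []
    i, j = best_pos
    while i > 0 and j > 0 and F[i][j] > 0:
        s = match if x[i - 1] == y[j - 1] else mismatch
        if F[i][j] == F[i - 1][j - 1] + s:
            ax.append(x[i - 1]); ay.append(y[j - 1])
            i -= 1; j -= 1
        elif F[i][j] == F[i - 1][j] + gap:
            ax.append(x[i - 1]); ay.append("-")
            i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1])
            j -= 1

    return best, "".join(reversed(ax)), "".join(reversed(ay)), F


if __name__ == "__main__":
    score, ax, ay, F = smith_waterman("AACT", "ACG")
    print(score)
    print(ax)
    print(ay)
```

If the highest score occurs in more than one cell, this sketch reports the alignment ending at the first such cell in row-by-row order. Other equally scoring local alignments may exist.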
Local Alignment Example
- Find the optimal local alignment of AACT and ACG.
- Scoring rules: match = 1, mismatch = 0, gap penalty GP = -1
- Solution: local alignment score = 2

   AC
   ||
   AC

Score matrix F (rows: AACT, columns: ACG):

           A   C   G
       0   0   0   0
   A   0   1   0   0
   A   0   1   1   0
   C   0   0   2   1
   T   0   0   1   2