MAT 4830 Mathematical Modeling 4.5 Phylogenetic Distances

  • Slides: 32
MAT 4830 Mathematical Modeling
4.5 Phylogenetic Distances I
http://myhome.spu.edu/lauw

Preview
  • Phylogenetic: of or relating to the evolutionary development of organisms.
  • Estimate the total number of mutations (both observed and hidden).

Example from 4.1
  • S0: ancestral sequence
  • S1: descendant of S0
  • S2: descendant of S1
  • Observed mutations: 2
  • Actual mutations: 5 (some are hidden mutations)

Distance of Two Sequences
  • We want to define the “distance” between two sequences.
  • It measures the average number of mutations per site that occurred, including the hidden ones.

Distance of Two Sequences
  • Let d(S0, S) be the distance between sequences S0 and S. What properties “should” it have?
    1.
    2.
    3.

Jukes-Cantor Model
  • Assume α is small.
  • Mutations per time step are “rare”.

Jukes-Cantor Model
  • q(t) = conditional probability that the base at time t is the same as the base at time 0

Jukes-Cantor Model
  • q(t) = fraction of sites with no observed mutations

Jukes-Cantor Model
  • p(t) = 1 - q(t) = fraction of sites with observed mutations
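For reference, these two quantities have a well-known closed form under the usual Jukes-Cantor assumptions. The sketch below assumes that α is the probability that a given site undergoes a substitution in one time step, and that each of the three other bases is equally likely (probability α/3 each); this parameterization is an assumption here, not something stated on the slide.

```latex
q(t) = \frac{1}{4} + \frac{3}{4}\left(1 - \frac{4}{3}\alpha\right)^{t},
\qquad
p(t) = 1 - q(t) = \frac{3}{4} - \frac{3}{4}\left(1 - \frac{4}{3}\alpha\right)^{t}.
```

Note that p(t) approaches 3/4 as t grows: after a long time the two sequences agree at about one site in four purely by chance.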

Jukes-Cantor Model
  • p can be estimated from the two sequences
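A minimal sketch of that estimate, written in Python rather than Maple: count the aligned sites where the two sequences differ and divide by the sequence length. The function name `observed_fraction` and the two 10-base sequences are hypothetical, made up for illustration.

```python
def observed_fraction(s0: str, s1: str) -> float:
    """Estimate p: the fraction of aligned sites at which s0 and s1 differ."""
    if len(s0) != len(s1):
        raise ValueError("sequences must be aligned and of equal length")
    if not s0:
        raise ValueError("sequences must be non-empty")
    # True counts as 1, so the sum is the number of differing sites.
    differing = sum(a != b for a, b in zip(s0, s1))
    return differing / len(s0)


# Hypothetical sequences: they differ at sites 5 and 10 only.
p_hat = observed_fraction("ACGTACGTAC", "ACGTTCGTAA")
print(p_hat)
```

This counts only the differences that are visible when comparing the two sequences; hidden mutations (a site that changed and changed back, or changed twice) are not reflected in the estimate of p.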

Example from 4.1
  • Observed mutations: 2

Jukes-Cantor Distance
  • Given p (and t), the J-C distance between two sequences S0 and S1 is defined as
    d_JC(S0, S1) = -(3/4) ln(1 - (4/3) p)
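One way to see where the logarithm comes from is sketched below. It assumes the usual Jukes-Cantor form p(t) = 3/4 - (3/4)(1 - 4α/3)^t, and it interprets the distance as the expected number of substitutions per site, αt; both are standard textbook assumptions rather than statements taken from the slide.

```latex
\begin{aligned}
1 - \tfrac{4}{3}\,p &= \left(1 - \tfrac{4}{3}\alpha\right)^{t} \\
\ln\left(1 - \tfrac{4}{3}\,p\right) &= t \ln\left(1 - \tfrac{4}{3}\alpha\right)
  \approx -\tfrac{4}{3}\,\alpha t \qquad (\alpha \text{ small}) \\
\alpha t &\approx -\tfrac{3}{4}\ln\left(1 - \tfrac{4}{3}\,p\right) = d_{JC}
\end{aligned}
```

The approximation in the second line uses ln(1 - x) ≈ -x for small x, which is where the assumption that α is small enters.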

Example from 4.3
  • Suppose 40-base ancestral and descendant DNA sequences are ...
  • Observed: 11 substitutions, i.e. 11/40 = 0.275 substitutions per site.
  • Estimated: 0.3426 substitutions per site, i.e. about 13.7 substitutions over the 40 sites.
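The arithmetic in this example can be checked with a short Python sketch. It assumes the standard Jukes-Cantor distance formula d = -(3/4) ln(1 - (4/3) p); the function name `jc_distance` is hypothetical.

```python
import math


def jc_distance(p: float) -> float:
    """Jukes-Cantor distance for an observed fraction p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("the J-C distance is only defined for 0 <= p < 3/4")
    return -0.75 * math.log(1 - 4 * p / 3)


sites = 40      # sequence length in the example
observed = 11   # observed substitutions in the example

p = observed / sites            # observed substitutions per site
d = jc_distance(p)              # estimated substitutions per site
print(round(p, 3), round(d, 4), round(d * sites, 1))
```

The estimate exceeds the observed count because the distance also accounts for hidden mutations.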

Performance of JC distance (Homework Problem 4)
  • Write a program to simulate the mutations of a sequence for t time steps using the Jukes-Cantor model with parameter α.
  • Count the number of base substitutions that occurred.
  • Compute the Jukes-Cantor distance between the initial and final sequences.
  • Compare the actual number of base substitutions with the estimate from the Jukes-Cantor distance.
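The four steps above can be sketched as follows. This is an illustrative outline in Python, not the Maple program the homework asks for. It assumes that at each time step every site mutates independently with probability α, changing to one of the three other bases with equal probability. The names `simulate_jc` and `jc_distance`, the seed, and the parameter values are all hypothetical choices.

```python
import math
import random

BASES = "ACGT"


def simulate_jc(seq, alpha, t, rng):
    """Evolve seq for t time steps; return (final sequence, actual substitutions)."""
    current = list(seq)
    substitutions = 0
    for _ in range(t):
        for i, base in enumerate(current):
            if rng.random() < alpha:
                # Substitute one of the three other bases, chosen uniformly.
                current[i] = rng.choice([b for b in BASES if b != base])
                substitutions += 1
    return "".join(current), substitutions


def jc_distance(s0, s1):
    """Jukes-Cantor distance (substitutions per site) between aligned sequences."""
    p = sum(a != b for a, b in zip(s0, s1)) / len(s0)
    if p >= 0.75:
        raise ValueError("p >= 3/4: the J-C distance is undefined")
    return -0.75 * math.log(1 - 4 * p / 3)


rng = random.Random(4830)                 # fixed seed for repeatability
n_sites, alpha, t = 10_000, 0.01, 30      # illustrative values only

start = "".join(rng.choice(BASES) for _ in range(n_sites))
end, actual = simulate_jc(start, alpha, t, rng)
estimated = jc_distance(start, end) * n_sites

print("actual substitutions:   ", actual)
print("estimated substitutions:", round(estimated))
```

With a long sequence and a small α, the two numbers should be close. Rerunning with a larger αt shows how the estimate becomes less reliable as p approaches 3/4.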

Maple: Strings Handling II
  • Concatenating two strings
  • However, no “re-assignment”.

Classwork
  • Work on HW #1, 2