
T-Coffee: What’s New in The Grinder
Mixing MSAs, Sequences and Structures
Cédric Notredame
Information Génétique et Structurale, CNRS-Marseille, France

What’s in a Multiple Alignment?
• Structural Criteria
  – Residues are arranged so that those playing a similar role end up in the same column.
• Evolutionary Criteria
  – Residues are arranged so that those having the same ancestor end up in the same column.
• Similarity Criteria
  – As many similar residues as possible in the same column.
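The similarity criterion can be made concrete with a column score. The sketch below is only an illustration: it assumes a plain identity-based sum-of-pairs score (one point per identical residue pair in a column, gaps ignored). The function name and the toy sequences are hypothetical, and real packages rely on substitution matrices and gap penalties instead.

```python
from itertools import combinations


def sum_of_pairs_identity(msa):
    """Count identical residue pairs per column, summed over all columns.

    `msa` is a list of equal-length aligned strings; '-' marks a gap.
    Illustrative scoring only: identity counts, gaps contribute nothing.
    """
    if len({len(row) for row in msa}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    score = 0
    for column in zip(*msa):
        for a, b in combinations(column, 2):
            if a != "-" and a == b:
                score += 1
    return score


# Two hypothetical alignments of the same three toy sequences.
good = ["ACD-F", "ACDEF", "AC-EF"]
bad = ["ACDF-", "ACDEF", "ACEF-"]

print(sum_of_pairs_identity(good), sum_of_pairs_identity(bad))
```

Under this toy score the first arrangement places more identical residues in shared columns than the second, which is all the similarity criterion asks for.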

What’s in a Multiple Alignment?
• The MSA contains what you put inside…
• You can view your MSA as:
  – A record of evolution
  – A summary of a protein family
  – A collection of experiments made for you by Nature…

Multiple Alignments: What Are They Good For???

Computing the Correct Alignment is a Complicated Problem

Off the Shelf Methods


A Taxonomy of Multiple Sequence Alignment Packages
[Figure labels: APPROXIMATE / FAST, ACCURATE / SLOW, Entropy]

Three Types of Algorithms
• Progressive: ClustalW
• Iterative: Muscle
• Consistency Based: T-Coffee and ProbCons

ClustalW

Muscle Algorithm: Using The Iteration


Consistency Based Algorithms: T-Coffee
• Gotoh (1990)
  – Iterative strategy using consistency
• Martin Vingron (1991)
  – Dot Matrices Multiplications
  – Accurate but too stringent
• Dialign (1996, Morgenstern)
  – Consistency
  – Agglomerative Assembly
• T-Coffee (2000, Notredame)
  – Consistency
  – Progressive algorithm
• ProbCons (2004, Do)
  – T-Coffee with a Bayesian Treatment

T-Coffee and Consistency…
• Each Library Line is a Soft Constraint (a wish)
• You can’t satisfy them all
• You must satisfy as many as possible (The easy ones)
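One way to picture how some wishes become "easy" is that constraints supported by other sequences gain weight. The sketch below assumes the triplet extension described in the original T-Coffee paper, where each path through a residue of a third sequence adds the minimum of the two weights along that path. The library layout, the function name, and the toy weights are hypothetical, and the library is assumed to contain only pairs between different sequences.

```python
from collections import defaultdict
from itertools import combinations


def extend_library(library):
    """Return triplet-extended weights for a primary library.

    `library` maps a pair of residues, each written as (sequence, position),
    to a primary weight. For every residue z linked to residues x and y of
    two different sequences, the pair (x, y) gains min(W(x, z), W(z, y)).
    """
    links = defaultdict(dict)    # residue -> {linked residue: weight}
    extended = defaultdict(int)  # sorted residue pair -> extended weight
    for (x, y), weight in library.items():
        links[x][y] = weight
        links[y][x] = weight
        extended[tuple(sorted((x, y)))] += weight
    for z, neighbors in links.items():
        for x, y in combinations(sorted(neighbors), 2):
            if x[0] != y[0]:  # x and y must come from different sequences
                extended[(x, y)] += min(neighbors[x], neighbors[y])
    return dict(extended)


# Hypothetical primary library over sequences A, B and C.
library = {
    (("A", 1), ("B", 1)): 88,
    (("A", 1), ("C", 1)): 77,
    (("C", 1), ("B", 1)): 100,
    (("A", 1), ("B", 2)): 20,  # a wish that conflicts with A1-B1
}
extended = extend_library(library)
print(extended[(("A", 1), ("B", 1))], extended[(("A", 1), ("B", 2))])
```

In this toy library the A1-B1 wish is backed by sequence C and ends up far heavier than the conflicting A1-B2 wish, so a progressive aligner maximizing extended weights would satisfy the former first.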

Validation Using BaliBase: T-Coffee Results

T-Coffee and Consistency…

Evaluating Methods… Who is the best? Says who…?


Structures Vs Sequences


Who is the Best???

Dataset    N      T-Coffee   ProbCons   ClustalW   Muscle
Hom+50     40     49.71      51.59      36.77      46.90
SABs+50    209    21.85      22.53      12.34      19.61
SABf+50    425    45.18      44.85      34.95      38.17
Prefab     1675   67.96      67.95      59.45      66.05

The Alignments Methods MAFFT


Too Many Methods for ONE Alignment: M-Coffee

Combining Many MSAs into ONE
[Figure labels: ClustalW, MAFFT, T-Coffee, MUSCLE]
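A minimal sketch of how several MSAs of the same sequences could be merged into one set of weighted constraints. It assumes each aligned residue pair is weighted by the number of input alignments that contain it. The function names and the toy alignments are hypothetical; this illustrates the idea and is not the M-Coffee implementation.

```python
from collections import Counter
from itertools import combinations


def aligned_pairs(msa):
    """Yield ((name_a, index_a), (name_b, index_b)) for residues in one column.

    `msa` maps a sequence name to its aligned string; '-' marks a gap.
    Indices are 0-based positions in the ungapped sequence.
    """
    names = sorted(msa)
    next_index = {name: 0 for name in names}
    for col in range(len(msa[names[0]])):
        present = []
        for name in names:
            if msa[name][col] != "-":
                present.append((name, next_index[name]))
                next_index[name] += 1
        yield from combinations(present, 2)


def combine_msas(msas):
    """Count how many input alignments support each aligned residue pair."""
    support = Counter()
    for msa in msas:
        support.update(aligned_pairs(msa))
    return support


# Two hypothetical alignments of the same two sequences by different methods.
msa_1 = {"s1": "AC-G", "s2": "ACTG"}
msa_2 = {"s1": "A-CG", "s2": "ACTG"}
support = combine_msas([msa_1, msa_2])
for pair, count in sorted(support.items()):
    print(pair, count)
```

Pairs on which the methods agree get a count of 2, while the disputed placement of the second residue of s1 gets a count of 1 for each alternative. The counts can then serve as library weights for a consistency-based aligner.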

The Right Mix of Methods

Resisting Noise M-Coffee 8


Going Further


Place your Bets…


www.tcoffee.org
www.vital-it.ch/prd/smoretti/cgi-bin/Tcoffee/tcoffee_cgi/index.cgi

When Sequences Are not Enough: 3D-Coffee and Expresso

3D-Coffee: Combining Sequences and Structures Within Multiple Sequence Alignments

• Threading: Fugue
  1 - Select 967 pairs of sequences in HOMSTRAD
  2 - Align each pair with T-Coffee and Fugue.
  3 - Compare the Two Alignments
  TCdef: 58.81%   Fugue: 61.81%
  [Figure labels: Fugue wins, TCdef wins]

• Superposition: SAP
  1 - Select 967 pairs of sequences in HOMSTRAD
  2 - Align each pair with T-Coffee and SAP.
  3 - Compare the Two Alignments
  TCdef: 58.81%   SAP: 86.31%
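The slides do not say how the two alignments are compared in step 3. A common choice, assumed here, is the percentage of residue pairs aligned in a reference alignment that are also aligned in the test alignment. The function names and the toy sequences are hypothetical.

```python
def residue_pairs(row_a, row_b):
    """Return the set of (i, j) residue indices aligned in a pairwise alignment.

    `row_a` and `row_b` are equal-length aligned strings; '-' marks a gap.
    Indices are 0-based positions in the ungapped sequences.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    pairs = set()
    i = j = 0
    for a, b in zip(row_a, row_b):
        if a != "-" and b != "-":
            pairs.add((i, j))
        if a != "-":
            i += 1
        if b != "-":
            j += 1
    return pairs


def percent_correct(test, reference):
    """Percentage of reference residue pairs that the test alignment recovers."""
    expected = residue_pairs(*reference)
    if not expected:
        return 0.0
    found = residue_pairs(*test)
    return 100.0 * len(expected & found) / len(expected)


# Hypothetical reference (e.g. structure based) and sequence-only alignment
# of the same two toy sequences, ACDEF and ACEGF.
reference = ("ACDE-F", "AC-EGF")
test = ("ACDEF", "ACEGF")
print(percent_correct(test, reference))
```

Averaging such a score over all pairs in a benchmark would give one figure per method, comparable in spirit to the percentages quoted above.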

3D-Coffee: Combining Sequences and Structures Within Multiple Sequence Alignments

The More Structures The Merrier
[Figure: Average Improvement over T-Coffee vs. Struc/Seq Ratio]

Expresso: Finding the Right Structure
Why Not Use Structure Based Alignments?
[Figure labels: Sources, BLAST, SAP, Templates, Template Alignment, Source Template Alignment, Remove Templates, Library]

14% Correct
>1aaza  >1ego  >1thx  >2trxa  >3trx  >3grx
1DE2A  1EGR  1THX  2BTOT  4TRX  3GRX
50% Correct

Conclusion
• The Best Recipe For Good Sequence Alignments: Structures!!!
• A Better Recipe: More Structures!!!

Conclusion
• Consistency Based Methods Have an Edge
• Hard to tell Methods Apart
• Sequence Alignment is NOT solved

www.tcoffee.org
• Fabrice Armougom (CNRS)
• Sebastien Moretti (CNRS)
• Olivier Poirot (CNRS)
• Frederic Reinier (CNRS, CRS4)
• Karsten Suhre (CNRS)
• Vladimir Saudek (Sanofi-Aventis)
• Des Higgins (UCD)
• Orla O’Sullivan (UCD)
• Iain Wallace (UCD)
• Bruno Nyfler (VitalIT)
• Victor Jongeneel (SIB, VitalIT)
• Roger Hersch (EPFL)
• Pierre Dumas (EPFL)
• Basile Schaeli (EPFL)
cedric.notredame@europe.com

Cédric Notredame and Michael Claverie