Pinpointing Uncertainty: comparing competing hypotheses - tests of trees

Comparing competing phylogenetic hypotheses - tests of two (or more) trees
• Particularly useful techniques are those designed to allow evaluation of alternative phylogenetic hypotheses
• Several such tests allow us to determine whether one tree is statistically significantly worse than another: winning sites, Templeton, Kishino-Hasegawa, parametric bootstrapping (SOWH), Shimodaira-Hasegawa, Approximately Unbiased

Tests of two trees
• Tests are of the null hypothesis that the differences between two trees (A and B) are no greater than expected from sampling error
• The simplest ‘winning sites’ test counts the number of sites supporting tree A over tree B and vice versa (those having fewer steps on, and so a better fit to, one of the trees)
• Under the null hypothesis characters are equally likely to support tree A or tree B, and a binomial distribution gives the probability of the observed difference in numbers of winning sites
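
The winning sites test described above can be sketched in a few lines. This is a minimal illustration, assuming the per-site step counts on each tree are already available as two lists; the function name `winning_sites_test` and the example data are hypothetical, and sites that fit both trees equally well are simply dropped.

```python
from math import comb

def winning_sites_test(steps_a, steps_b):
    """Two-sided binomial test on the numbers of sites favouring each tree."""
    # A site "wins" for the tree on which it requires fewer steps.
    wins_a = sum(a < b for a, b in zip(steps_a, steps_b))
    wins_b = sum(b < a for a, b in zip(steps_a, steps_b))
    n = wins_a + wins_b              # tied sites carry no information here
    k = min(wins_a, wins_b)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return wins_a, wins_b, min(1.0, 2 * tail)

# Hypothetical step counts for ten characters on trees A and B.
steps_on_a = [1, 1, 1, 1, 1, 1, 1, 1, 2, 1]
steps_on_b = [2, 2, 2, 2, 2, 2, 2, 2, 1, 1]
wins_a, wins_b, p_value = winning_sites_test(steps_on_a, steps_on_b)
print(wins_a, wins_b, p_value)
```

Only the direction of each difference is used, which is why the test is simple but discards information about how strongly each site favours one tree.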

The Templeton test
• Templeton’s test is a non-parametric Wilcoxon signed ranks test of the differences in fits of characters to two trees
• It is like the ‘winning sites’ test but also takes into account the magnitudes of the differences in the support of characters for the two trees
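
A rough sketch of the signed ranks calculation follows. It assumes per-character step counts on the two trees, uses the large-sample normal approximation without a tie correction to the variance, and drops characters that fit both trees equally; the function name `templeton_test` and the data are hypothetical. With few informative characters an exact table would be more appropriate.

```python
from math import erfc, sqrt

def templeton_test(steps_a, steps_b):
    """Wilcoxon signed ranks test on per-character step differences."""
    diffs = [a - b for a, b in zip(steps_a, steps_b) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0              # no character distinguishes the trees
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    # Sum of ranks for characters needing more steps on tree A.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return w_plus, erfc(abs(z) / sqrt(2))   # two-sided p-value

# Hypothetical step counts for six characters.
w_plus, p_value = templeton_test([3, 1, 2, 1, 4, 1], [1, 1, 1, 2, 1, 1])
print(w_plus, p_value)
```

Unlike the winning sites count, a character that costs three extra steps on one tree contributes a larger rank than one that costs a single extra step.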

Templeton’s test - an example
[Figure: tree of Eosauropterygia, Placodus, Lepidosauriformes, Archosauromorpha, Younginiformes, Claudiosaurus, Araeoscelidia, Paleothyris, Captorhinidae, Parareptilia, Synapsida, Diadectomorpha and Seymouriidae, with the two alternative positions of turtles marked 1 and 2]
Recent studies of the relationships of turtles using morphological data have produced very different results, with turtles grouping either within the parareptiles (H1) or within the diapsids (H2), the result depending on the morphologist.
This suggests there may be:
- problems with the data
- special problems with turtles
- weak support for turtle relationships
Parsimony analysis of the most recent data favoured H2.
However, analyses constrained by H1 produced trees that required only 3 extra steps (<1% of tree length).
The Templeton test was used to evaluate the trees and showed that the slightly longer H1 tree found in the constrained analyses was not significantly worse than the unconstrained H2 tree.
The morphological data do not allow a choice between H1 and H2.

Kishino-Hasegawa test
• The Kishino-Hasegawa test is similar in using differences in the support provided by individual sites for two trees to determine whether the overall differences between the trees are significantly greater than expected from random sampling error
• It is a parametric test that depends on the assumptions that the characters are independent and identically distributed (the same assumptions that underlie the statistical interpretation of bootstrapping)
• It can be used with parsimony and maximum likelihood, and is implemented in PHYLIP and PAUP*

Kishino-Hasegawa test
[Figure: distribution of step/likelihood differences at each site, centred on an expected mean of 0, with sites favouring tree A on one side and sites favouring tree B on the other]
Under the null hypothesis the mean of the differences in parsimony steps or likelihoods for each site is expected to be zero, and the distribution normal.
From the observed differences we calculate a standard deviation.
If the difference between trees (tree lengths or likelihoods) is attributable to sampling error, then characters will randomly support tree A or B and the total difference will be close to zero.
The observed difference is significantly greater than zero if it is greater than 1.96 standard deviations.
This allows us to reject the null hypothesis and declare the suboptimal tree significantly worse than the optimal tree (p < 0.05).
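
The calculation described above can be sketched as follows, assuming per-site scores (log-likelihoods or step counts) for both trees are already in hand. The function name `kh_test` and the toy numbers are hypothetical; PHYLIP and PAUP* may differ in detail, so this is only an illustration of the normal-approximation form of the test.

```python
from math import erfc, sqrt

def kh_test(site_scores_a, site_scores_b):
    """Normal-approximation test of the summed per-site score difference."""
    diffs = [a - b for a, b in zip(site_scores_a, site_scores_b)]
    n = len(diffs)
    total = sum(diffs)
    mean = total / n
    # Variance of the summed difference, estimated from the per-site spread.
    var_total = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    if var_total == 0:
        raise ValueError("per-site differences show no variation")
    sd_total = sqrt(var_total)
    z = total / sd_total
    return total, sd_total, erfc(abs(z) / sqrt(2))   # two-sided p-value

# Hypothetical per-site score differences for four sites.
total, sd_total, p_value = kh_test([1, -1, 2, 0], [0, 0, 0, 0])
print(total, sd_total, p_value)
```

The null hypothesis is rejected at the 5% level when the total difference lies more than about 1.96 of these standard deviations from zero.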

Kishino-Hasegawa test
[Figure: maximum likelihood tree from ciliate SSU rDNA, with the anaerobic ciliates with hydrogenosomes marked]
Parsimonious character optimization of the presence and absence of hydrogenosomes suggests four separate origins of hydrogenosomes within the ciliates.
Questions:
- how reliable is this result?
- in particular, how well supported is the idea of multiple origins?
- how many origins can we confidently infer?

Kishino-Hasegawa test
[Figure: two topological constraint trees]
Parsimony analyses with topological constraints found the shortest trees that force hydrogenosomal ciliate lineages together, thereby reducing the number of separate origins of hydrogenosomes.
Each of the constrained parsimony trees was compared to the ML tree, and the Kishino-Hasegawa test was used to determine which of these trees were significantly worse than the ML tree.

Kishino-Hasegawa test
Test summary and results (simplified)

No. origins  Constraint tree    Extra steps  Difference  SD  Significantly worse?
4            ML                 +10          -           -   -
4            MP                 -            -13         18  No
3            (cp, pt)           +13          -21         22  No
3            (cp, rc)           +113         -337        40  Yes
3            (cp, m)            +47          -147        36  Yes
3            (pt, rc)           +96          -279        38  Yes
3            (pt, m)            +22          -68         29  Yes
3            (rc, m)            +63          -190        34  Yes
2            (pt, cp, rc)       +123         -432        40  Yes
2            (pt, rc, m)        +100         -353        43  Yes
2            (pt, cp, m)        +40          -140        37  Yes
2            (cp, rc, m)        +124         -466        49  Yes
2            (pt, cp)(rc, m)    +77          -222        39  Yes
2            (pt, m)(rc, cp)    +131         -442        48  Yes
2            (pt, rc)(cp, m)    +140         -414        50  Yes
1            (pt, cp, m, rc)    +131         -515        49  Yes

Constrained analyses were used to find the most parsimonious trees with fewer than four separate origins of hydrogenosomes.
Each tree was tested against the ML tree.
Trees with 2 or 1 origins are all significantly worse than the ML tree.
We can confidently conclude that there have been at least three separate origins of hydrogenosomes within the sampled ciliates.
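
As a quick check of the results above, the tabulated differences can be compared with their standard deviations. The sketch below assumes the differences and SDs pair with the constraint trees in the order listed, starting from the MP tree, and uses the two-sided 5% critical value of the standard normal; the names `RESULTS` and `significantly_worse` are hypothetical.

```python
# (constraint tree, difference from the ML tree, SD), in the order tabulated.
# The pairing of values with trees is an assumption based on that order.
RESULTS = [
    ("MP", -13, 18), ("(cp, pt)", -21, 22), ("(cp, rc)", -337, 40),
    ("(cp, m)", -147, 36), ("(pt, rc)", -279, 38), ("(pt, m)", -68, 29),
    ("(rc, m)", -190, 34), ("(pt, cp, rc)", -432, 40),
    ("(pt, rc, m)", -353, 43), ("(pt, cp, m)", -140, 37),
    ("(cp, rc, m)", -466, 49), ("(pt, cp)(rc, m)", -222, 39),
    ("(pt, m)(rc, cp)", -442, 48), ("(pt, rc)(cp, m)", -414, 50),
    ("(pt, cp, m, rc)", -515, 49),
]

def significantly_worse(difference, sd, critical=1.96):
    """True if the difference exceeds the critical number of SDs."""
    return abs(difference) > critical * sd

# Trees that cannot be rejected in favour of the ML tree.
not_rejected = [name for name, diff, sd in RESULTS
                if not significantly_worse(diff, sd)]
print(not_rejected)
```

Under these assumptions only the MP tree and the (cp, pt) constraint fall within 1.96 standard deviations of the ML tree.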

Problems with tests of trees
• To be statistically valid, the Kishino-Hasegawa test should be of trees that are selected a priori
• However, most applications have used trees selected a posteriori on the basis of the phylogenetic analysis
• Where we test the ‘best’ tree against some other tree, the KH test will be biased towards rejection of the null hypothesis
• Only if the null hypothesis is not rejected will the result be safe from some unknown level of bias

Problems with tests of trees
• The Shimodaira-Hasegawa test is a more statistically correct technique for testing trees selected a posteriori and is implemented in PAUP*
• However, it requires selection of a set of plausible topologies, and it is hard to give practical advice on how to choose them
• Parametric bootstrapping (the SOWH test) is an alternative, but it is harder to implement and may suffer from an opposite bias due to model misspecification
• The Approximately Unbiased test (implemented in CONSEL) may be the best option currently

Taxonomic Congruence
• Trees inferred from different data sets (different genes, morphology) should agree if they are accurate
• Congruence between trees is best explained by their accuracy
• Congruence can be investigated using consensus (and supertree) methods
• Incongruence requires further work to explain or resolve disagreements
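
One simple way to look at congruence is to list the clades that two trees share, which is the information a strict consensus retains. The sketch below assumes rooted trees on the same taxa written as nested tuples; the function names and the five-taxon example are hypothetical and not taken from any particular consensus program.

```python
def clades(tree):
    """Return (all taxa, set of clades below the root) for a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):          # a leaf is a taxon name
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    all_taxa = walk(tree)
    found.discard(all_taxa)                # the root clade is uninformative
    return all_taxa, found

def shared_clades(tree_1, tree_2):
    """Clades present in both trees (the groups a strict consensus keeps)."""
    taxa_1, clades_1 = clades(tree_1)
    taxa_2, clades_2 = clades(tree_2)
    if taxa_1 != taxa_2:
        raise ValueError("trees must contain the same taxa")
    return clades_1 & clades_2

# Hypothetical trees from two data sets that disagree only within (A, B, C).
gene_tree = ((("A", "B"), "C"), ("D", "E"))
morphology_tree = ((("A", "C"), "B"), ("D", "E"))
for clade in sorted(shared_clades(gene_tree, morphology_tree), key=len):
    print(sorted(clade))
```

Clades missing from the shared set point to where the data sets are incongruent and further work is needed.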

Reliability of Phylogenetic Methods
• Phylogenetic methods (e.g. parsimony, distance, ML) can also be evaluated in terms of their general performance, particularly their:
  - consistency: whether they approach the truth with more data
  - efficiency: how quickly they do so (how much data is needed)
  - robustness: sensitivity to violations of assumptions
• Studies of these properties can be analytical or by simulation

Reliability of Phylogenetic Methods
• There have been many arguments that ML methods are best because they have desirable statistical properties, such as consistency
• However, ML does not always have these properties:
  - if the model is wrong or inadequate (fortunately this is testable to some extent)
  - the properties have not yet been demonstrated for complex inference problems such as phylogenetic trees

Reliability of Phylogenetic Methods
• “Simulations show that ML methods generally outperform distance and parsimony methods over a broad range of realistic conditions” (Whelan et al. 2001, Trends in Genetics 17: 262-272)
• But…
• Most simulations cover a narrow range of very (unrealistically) simple conditions:
  - few taxa (typically just four!)
  - few parameters (standard models: JC, K2P etc.)

Reliability of Phylogenetic Methods
• Simulations with four taxa have shown:
  - Model-based methods (distance and maximum likelihood) perform well when the model is accurate (not surprising!)
  - Violations of assumptions can lead to inconsistency for all methods (a Felsenstein zone) when branch lengths or rates are highly unequal
  - Maximum likelihood methods are quite robust to violations of model assumptions
  - Weighting can improve the performance of parsimony (reduce the size of the Felsenstein zone)
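
A four-taxon simulation of this kind is easy to set up. The sketch below assumes two-state characters evolving on the true tree ((A,B),(C,D)), with a high probability of change on the branches to A and C and a low probability on the other three branches; all names, probabilities and the seed are hypothetical choices for illustration.

```python
import random

def flip(state, prob, rng):
    """Return the other state with probability prob, else the same state."""
    return 1 - state if rng.random() < prob else state

def simulate_site(rng, p_long, q_short):
    """One two-state character on the true tree ((A,B),(C,D))."""
    u = rng.randint(0, 1)                  # ancestor of A and B
    v = flip(u, q_short, rng)              # ancestor of C and D (short internal branch)
    a = flip(u, p_long, rng)               # long branch
    b = flip(u, q_short, rng)
    c = flip(v, p_long, rng)               # long branch
    d = flip(v, q_short, rng)
    return a, b, c, d

def parsimony_support(n_sites, p_long, q_short, seed=1):
    """Count parsimony-informative sites favouring each unrooted topology."""
    rng = random.Random(seed)
    support = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for _ in range(n_sites):
        a, b, c, d = simulate_site(rng, p_long, q_short)
        if a == b and c == d and a != c:
            support["AB|CD"] += 1
        elif a == c and b == d and a != b:
            support["AC|BD"] += 1
        elif a == d and b == c and a != b:
            support["AD|BC"] += 1
    return support

def parsimony_choice(support):
    """Parsimony prefers the topology with the most supporting sites."""
    return max(support, key=support.get)

print(parsimony_choice(parsimony_support(5000, p_long=0.1, q_short=0.1)))
print(parsimony_choice(parsimony_support(5000, p_long=0.4, q_short=0.05)))
```

With equal branches the true grouping AB|CD should win; with highly unequal branches parsimony is expected to group the two long branches (AC|BD), and adding more sites only strengthens the wrong answer.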

Reliability of Phylogenetic Methods
• However:
  - Generalising from four-taxon simulations may be dangerous, as conclusions may not hold for more complex cases
  - A few large-scale simulations (many taxa) have suggested that parsimony can be very accurate and efficient
  - Most methods are accurate in correctly recovering known phylogenies produced in laboratory studies
• More realistic simulations are needed if they are to help in choosing and understanding methods
• You can try your own…