Limitations of Hierarchical and Mixture Model Comparisons By

M. A. C. S. Sampath Fernando, Department of Statistics, University of Auckland
and James M. Curran, Renate Meyer, Jo-Anne Bright, John S. Buckleton

Introduction
- What is modelling?
  - A statistical model is a probabilistic system
  - A probability distribution
  - A finite/infinite mixture of probability distributions
- Good models
- All models are approximations
  π = 3.14159? π ≈ 3.14159, π ≈ 3.14, π ≈ 3

Modelling the behaviour of data
If we wish to perform statistical inference, or use our model to probabilistically evaluate the behaviour of new observations, then we need three steps:
1. Assume that the data are generated from some statistical distribution
2. Write down equations for the parameters of the assumed distribution, e.g. the mean and the standard deviation
3. Use standard techniques to estimate the unknown parameters in step 2
Steps 1–3 should be repeated as often as possible to get the "best" model. Most model building consists of steps 2 and 3.
Classical and Bayesian approaches
- Distributional assumptions on data
- Model parameters
- Prior distributions (beliefs) on model parameters
- Parameter estimates
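The three steps above can be sketched in a few lines. This is only an illustration, assuming a normal distribution for step 1 and maximum likelihood for step 3; the function names and the toy data are hypothetical and do not come from the slides.

```python
import math
from statistics import fmean

def fit_normal_mle(data):
    """Steps 2 and 3: the parameters are the mean and standard deviation,
    estimated here by maximum likelihood (an assumed estimation technique)."""
    mu = fmean(data)
    # The MLE of the variance divides by n, not n - 1.
    sigma = math.sqrt(fmean([(x - mu) ** 2 for x in data]))
    return mu, sigma

def normal_log_lik(data, mu, sigma):
    """Log-likelihood of the data under the step 1 assumption (normal)."""
    n = len(data)
    sq = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - sq / (2 * sigma ** 2)

# Hypothetical toy data, for illustration only.
toy = [1.0, 2.0, 3.0, 4.0, 5.0]
mu_hat, sigma_hat = fit_normal_mle(toy)
log_lik_hat = normal_log_lik(toy, mu_hat, sigma_hat)
```

Repeating the same three steps with a different assumed distribution gives a second log-likelihood to compare against, which is the loop the slide describes.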

Bayesian Statistical Models
Simple models
- Model parameter(s)
- Prior distribution(s) – distribution(s) on model parameter(s)
- Hyperparameter(s) – parameter(s) of prior distribution(s)
Hierarchical models
- Hyperprior(s) – prior distribution(s) on hyperparameter(s)
- Hyperhyperparameter(s) – parameter(s) of hyperprior distribution(s)
Mixture models
- Assume two or more heterogeneous, unknown sources of data
- Each data source is called a cluster
- Clusters are modelled using parametric models (simple or hierarchical)
- Represented as a weighted average of the cluster models
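The "weighted average of cluster models" can be made concrete with a small sketch. It assumes normal cluster densities purely for illustration; the function names, weights and parameters are hypothetical and do not come from the slides.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def mixture_pdf(x, weights, components):
    """Mixture density: a weighted average of the cluster densities.
    weights must be non-negative and sum to 1;
    components is a list of (mu, sigma) pairs, one per cluster."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * normal_pdf(x, mu, sigma)
               for w, (mu, sigma) in zip(weights, components))

# Hypothetical two-cluster example.
density_at_zero = mixture_pdf(0.0, [0.3, 0.7], [(0.0, 1.0), (2.0, 0.5)])
```

In a hierarchical mixture, each `(mu, sigma)` pair would itself be given a prior with its own hyperparameters rather than being fixed.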

Electropherogram (EPG) [figure]

Models for stutter
- The smallest value of the stutter ratio (SR) is zero; we expect most SR values to be small, with a long tail out to the right
- Mean SR behaviour is affected by the longest uninterrupted sequence of the allele (LUS)
- We expect the values of SR to be more variable for smaller values of Oa

Mean and Variance in Stutter Ratio
- Mean stutter ratio
- Variance in stutter ratio is inversely proportional to allele height
  - A common variance for all the loci: models with profile-wide variances
  - Locus-specific variance
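The variance statement can be written out as follows. This is a sketch of one reading of the slide, not the authors' exact formulation: it assumes Oa denotes the allele height, and the symbols \sigma^2 (profile-wide) and \sigma_l^2 (for locus l) are introduced here for illustration only.

```latex
\operatorname{Var}(SR_a) \propto \frac{1}{O_a}
\qquad\Longrightarrow\qquad
\operatorname{Var}(SR_a) = \frac{\sigma^2}{O_a}
\ \text{(profile-wide)}
\quad\text{or}\quad
\operatorname{Var}(SR_a) = \frac{\sigma_l^2}{O_a}
\ \text{(locus-specific)}
```

Under either form, smaller values of Oa give a larger variance, which matches the expectation that SR is more variable for smaller Oa.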

Different models for SR

Measures of Predictive Accuracy

Information Criterion

Information Criterion
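As an illustration of how WAIC is computed from posterior draws, here is a minimal sketch using the two standard penalty estimates (twice the gap between the log of the mean likelihood and the mean log-likelihood, and the posterior variance of the log-likelihood). It is an assumption that these are what the "WAIC 1" and "WAIC 2" labels in the results refer to. The function name and input layout are hypothetical.

```python
import math
from statistics import fmean, variance

def waic(log_lik):
    """log_lik[s][i] = log p(y_i | theta_s) for posterior draw s, observation i.
    Returns lppd, both penalty estimates, and WAIC on the deviance (-2) scale."""
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    lppd = p1 = p2 = 0.0
    for i in range(n_obs):
        col = [log_lik[s][i] for s in range(n_draws)]
        # log of the posterior mean likelihood, computed stably (log-sum-exp).
        m = max(col)
        lppd_i = m + math.log(sum(math.exp(v - m) for v in col) / n_draws)
        lppd += lppd_i
        p1 += 2.0 * (lppd_i - fmean(col))  # penalty version 1
        p2 += variance(col)                # penalty version 2 (sample variance)
    return {
        "lppd": lppd,
        "p_waic1": p1,
        "p_waic2": p2,
        "waic1": -2.0 * (lppd - p1),
        "waic2": -2.0 * (lppd - p2),
    }

# Hypothetical draws: 3 posterior samples, 2 observations.
result = waic([[-1.0, -2.0], [-1.2, -1.9], [-0.9, -2.3]])
```

Both versions need only the pointwise log-likelihood at each posterior draw, so the same function applies to simple, hierarchical and mixture models alike.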

Results
Model   Log lik    -2 log lik   k     2k    log(n)k   Penalty
LN0     13350.6    -26701.2     33    66    278.6     41.1
G0      14212      -28424.1     33    66    278.6     33.7
LN2     14463.1    -28926.2     48+   -     -         40.2
LN1     14463.9    -28927.8     48    96    405.3     48.4
G2      14776.8    -29553.5     48+   -     -         46.1
G1      14777.7    -29555.3     48    96    405.3     33.9
N0      14877.3    -29754.6     33    66    278.6     44.9
N2      15233.1    -30466.1     48+   -     -         32.8
N1      15233.7    -30467.4     48    96    405.3     53.6
MLN2    15276      -30551.9     65+   -     -         -
MLN1    15276      -30552       65    130   548.8     -
T0      15328.8    -30657.6     49    98    413.7     38.7
T2      15348.3    -30696.7     64+   -     -         87.8
T1      15352.1    -30704.2     64    128   540.4     -
MN2     15471.9    -30943.8     65+   -     -         -
MN1     15473.3    -30946.6     65    130   548.8     -
MT1     15749.2    -31498.4     81    162   683.9     -
MT2     15833.9    -31667.7     81+   -     -         -
Unassigned penalty value: 104.7
Rank   Model   WAIC 1    Model   WAIC 2
1      LN0     -25141    LN0     -22973
2      G0      -26947    G0      -25070
3      LN2     -27350    LN2     -25188
4      LN1     -27353    LN1     -25193
5      G2      -28084    T2      -25764
6      G1      -28091    G1      -26237
7      N0      -28294    G2      -26244
8      T2      -28906    N0      -26501
9      N2      -28977    T0      -26525
10     N1      -29000    MN2     -26539
11     MLN1    -29021    T1      -26568
12     MLN2    -29141    MT2     -26569
13     T0      -29343    MN1     -26589
14     MN2     -29348    MT1     -26604
15     MT2     -29362    MLN2    -27047
16     MN1     -29366    MLN1    -27088
17     T1      -29367    N2      -27172
18     N1      -27213 is listed under WAIC 2 only; rank 18 under WAIC 1 is MT1 -29378
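The "2 k" and "Log(n) k" columns are the penalty terms of AIC and BIC, so both criteria follow from the log-likelihood by the standard definitions AIC = -2 log L + 2k and BIC = -2 log L + k log(n). The sketch below applies them to the LN0 row. The sample size is not stated on the slide, so the value used here is only the one implied by that row's Log(n) k entry, and the function names are hypothetical.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: -2 log L + 2k."""
    return -2.0 * log_lik + 2.0 * k

def bic(log_lik, k, n):
    """Bayesian information criterion: -2 log L + k log(n)."""
    return -2.0 * log_lik + k * math.log(n)

# LN0 row of the results: log-likelihood 13350.6 with k = 33 parameters.
log_lik_ln0, k_ln0 = 13350.6, 33
# Assumption: n is recovered from the row's Log(n) k value of 278.6.
n_implied = math.exp(278.6 / k_ln0)

aic_ln0 = aic(log_lik_ln0, k_ln0)
bic_ln0 = bic(log_lik_ln0, k_ln0, n_implied)
```

Lower values indicate the preferred model on this scale. For rows where k is given only as a lower bound (e.g. "48+"), neither penalty can be computed exactly.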

Summary
- Posterior predictive checks are very useful in model comparisons (even with completely different models)
- Information criteria are useful under some circumstances
- WAIC is a fully Bayesian method and performs better than AIC, BIC and DIC in many respects
Model                      AIC   BIC   DIC   WAIC
Simple
Hierarchical
Non-hierarchical mixture
Hierarchical mixture

Thank you!