Power Calculations for GWAS

What is Power?
• Power is the probability that we will detect a true association between a SNP and a phenotype
• It is common to use a power of 80% or 90% for study design
• A power of 80% means that there is an 80% chance that the association will be discovered, given the assumptions that you have made
• The main assumptions concern the p-value threshold for significance, the sample size and the odds ratio or relative risk

Relative Risk

                          Disease   Healthy   Total
Risk allele count         DR        HR        TR
Protective allele count   DP        HP        TP

RR = (DR/TR) / (DP/TP)

                          Disease   Healthy   Total
Risk allele               220       180       400
Protective allele         780       820       1600
Total                     1000      1000      2000

Probability of disease with risk allele:        DR/TR = 220/400  = 0.550
Probability of disease with protective allele:  DP/TP = 780/1600 = 0.488
Relative risk:                                  (DR/TR) / (DP/TP) = 1.128
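The relative risk arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch; the function name `relative_risk` and its argument names are illustrative choices that simply mirror the DR, HR, DP and HP cells of the table.

```python
def relative_risk(dr, hr, dp, hp):
    """RR = (DR/TR) / (DP/TP), computed from allele counts."""
    tr = dr + hr  # total risk alleles
    tp = dp + hp  # total protective alleles
    return (dr / tr) / (dp / tp)


# Allele counts from the slide: 220/180 risk, 780/820 protective.
rr = relative_risk(dr=220, hr=180, dp=780, hp=820)
print(f"Relative risk = {rr:.3f}")  # Relative risk = 1.128
```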

Odds Ratio

                          Disease   Healthy   Total
Risk allele count         DR        HR        TR
Protective allele count   DP        HP        TP

OR = (DR/HR) / (DP/HP)

                          Disease   Healthy   Total
Risk                      220       180       400
Protected                 780       820       1600
Total                     1000      1000      2000

Ratio of diseased to healthy with risk allele:        DR/HR = 220/180 = 1.222
Ratio of diseased to healthy with protective allele:  DP/HP = 780/820 = 0.951
Odds ratio:                                           (DR/HR) / (DP/HP) = 1.285
Relative risk (for comparison):                       1.128
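The odds ratio for the same counts can be computed in the same way. Again this is only a sketch; `odds_ratio` is an illustrative name, not part of any particular package.

```python
def odds_ratio(dr, hr, dp, hp):
    """OR = (DR/HR) / (DP/HP), computed from allele counts."""
    return (dr / hr) / (dp / hp)


# Same allele counts as the slide.
or_value = odds_ratio(dr=220, hr=180, dp=780, hp=820)
print(f"Odds ratio = {or_value:.3f}")  # Odds ratio = 1.285
```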

Comparison of Relative Risk and Odds Ratio
Minor allele frequency = 0.3
• The exact relationship will depend on MAF
• For a risk allele (effect greater than 1) the odds ratio will always give a larger estimate of effect than the relative risk
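To see how the two measures relate, the sketch below rebuilds the allele-count table implied by a chosen odds ratio and reads the relative risk off it. It rests on assumptions that are not stated on the slide: `maf` is taken to be the risk-allele frequency among the healthy alleles, and the diseased and healthy groups are assumed to contribute the same number of alleles. The function name is hypothetical.

```python
def table_rr_from_or(odds_ratio, maf, n_alleles=1000):
    """Relative risk from the allele-count table implied by an odds ratio.

    Assumptions (illustrative): maf is the risk-allele frequency among
    healthy alleles, and the diseased and healthy groups each contribute
    n_alleles alleles.
    """
    # Risk-allele frequency among diseased alleles implied by the odds ratio.
    case_freq = odds_ratio * maf / (1 - maf + odds_ratio * maf)
    dr, dp = n_alleles * case_freq, n_alleles * (1 - case_freq)
    hr, hp = n_alleles * maf, n_alleles * (1 - maf)
    return (dr / (dr + hr)) / (dp / (dp + hp))


for odds_ratio in (1.2, 1.5, 2.0, 3.0):
    rr = table_rr_from_or(odds_ratio, maf=0.3)
    print(f"OR = {odds_ratio:.1f}  ->  RR = {rr:.3f}")
```

Under these assumptions the relative risk stays between 1 and the odds ratio whenever the odds ratio is above 1, and changing `maf` changes the size of the gap.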

Assumptions required for Power calculation
• Power depends on
  • Inheritance model
  • Sample size
  • Number of independent tests (SNPs tested)
  • Minor allele frequency
  • Probability of having the phenotype if you have the risk allele (odds ratio)
  • Linkage disequilibrium between marker SNP and causative SNP

Inheritance Models
• Mendelian trait
  • Phenotype controlled by a single locus. Easiest to detect.
  • All affected individuals will have the risk allele
  • Some unaffected individuals may have the risk allele (<100% penetrance), e.g. Huntington's disease
  • Easily discovered in small sets of families
• Quantitative trait
  • Phenotype controlled by multiple loci, e.g. hundreds of loci have been found associated with height.
  • Not all individuals with the phenotype will have the causative allele at a particular locus
  • A critical mass of causative loci may be required before the phenotype develops.
  • E.g. an individual might need 50 risk alleles, out of many hundred possible risk alleles, to develop cancer
• We need to discover a statistically significant excess of an allele in the affected population over the control population
• We are measuring the increased risk of exhibiting the phenotype associated with each variant

Inheritance model
• Dominant
  • These are the easiest to detect, as the contribution to risk from a heterozygote will be the same as from a homozygote. Therefore the association with the locus will be stronger.
  • These were made famous by Mendel's peas, but I am not aware of any examples associated with quantitative traits and we will ignore them
• Additive (co-dominant)
  • Both alleles contribute approximately the same amount to risk of disease.
  • This is the commonest mode of inheritance and the one that we will assume.
• Recessive
  • This is the hardest to detect, since only homozygotes will be at risk of disease and these may be rare in the population.
  • E.g. an allele present at 10% frequency will be homozygous in only 1% of the population (Hardy-Weinberg)
  • Can be maintained by balancing selection, e.g. sickle cell anaemia
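The Hardy-Weinberg example for the recessive model can be worked through in code. This is a small sketch, assuming a biallelic locus in Hardy-Weinberg equilibrium; the function name and the genotype labels `AA`, `Aa`, `aa` are illustrative, with `a` standing for the risk allele.

```python
def hardy_weinberg(risk_allele_freq):
    """Expected genotype frequencies at a biallelic locus under
    Hardy-Weinberg equilibrium ('a' is the risk allele)."""
    q = risk_allele_freq
    p = 1 - q
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


genotypes = hardy_weinberg(0.10)

# Fraction of the population at risk under each model.
at_risk_recessive = genotypes["aa"]                  # homozygotes only
at_risk_dominant = genotypes["Aa"] + genotypes["aa"]  # any carrier

print(f"Recessive model: {at_risk_recessive:.2%} at risk")  # 1.00%
print(f"Dominant model:  {at_risk_dominant:.2%} at risk")   # 19.00%
```

With a 10% allele only 1% of individuals are informative under a recessive model, compared with 19% under a dominant one, which is why recessive effects are so much harder to detect.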

Population size
• Increasing population size will increase power to detect a given odds ratio
• The plot shows how power increases with population size for a small odds ratio
• The plot is highly dependent on the particular odds ratio chosen
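The dependence of power on sample size can be illustrated with a rough calculation. The sketch below is not the method used to draw the slide's plot; it assumes a two-sided allelic test comparing risk-allele frequencies in cases and controls, a normal approximation, two alleles per individual, and `maf` taken as the risk-allele frequency in controls. The function name and all parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def allelic_power(n_cases, n_controls, maf, odds_ratio, alpha=0.05):
    """Approximate power of a two-sided allelic test (normal approximation).

    Assumptions (illustrative): maf is the risk-allele frequency in
    controls, each individual contributes two alleles, and the small
    probability of rejecting in the wrong direction is ignored.
    """
    p0 = maf
    # Risk-allele frequency in cases implied by the allelic odds ratio.
    p1 = odds_ratio * maf / (1 - maf + odds_ratio * maf)
    se = sqrt(p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p1 - p0) / se - z_crit)


for n in (500, 1000, 2000, 4000):
    power = allelic_power(n, n, maf=0.3, odds_ratio=1.2)
    print(f"{n} cases + {n} controls: power = {power:.2f}")
```

Changing `odds_ratio` shifts the whole curve, and passing a stricter `alpha` lowers the power at every sample size.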

Multiple Testing
• We will be testing 1,000s of SNP loci, which are assumed to have independent effects
• If we test 100 loci using a 5% alpha then we would expect to get 5 positive associations even if all the data were completely random.
• We will use the Bonferroni correction to control for this
  • Divide the alpha by the number of loci tested
  • If we use 100 SNP loci then we would set the required alpha to 0.05/100 = 0.0005
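The Bonferroni arithmetic is simple enough to express directly. A minimal sketch, with illustrative function names:

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of significant results when no locus has a true effect."""
    return alpha * n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


n_loci = 100
print(f"Expected false positives at alpha = 0.05: "
      f"{expected_false_positives(0.05, n_loci):.1f}")   # 5.0
print(f"Bonferroni-corrected alpha: "
      f"{bonferroni_alpha(0.05, n_loci):.4f}")           # 0.0005
```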

Effect of Number of Tests on Power
Hong, E. P. & Park, J. W. Sample Size and Statistical Power Calculation in Genetic Association Studies. Genomics Inform 10, 117 (2012).

Effect of Minor Allele Frequency on Power
Hong, E. P. & Park, J. W. Sample Size and Statistical Power Calculation in Genetic Association Studies. Genomics Inform 10, 117 (2012).

Effect of Disease Prevalence on Power
Hong, E. P. & Park, J. W. Sample Size and Statistical Power Calculation in Genetic Association Studies. Genomics Inform 10, 117 (2012).

Effect of Linkage Disequilibrium on Power
Hong, E. P. & Park, J. W. Sample Size and Statistical Power Calculation in Genetic Association Studies. Genomics Inform 10, 117 (2012).

Exercise
• In the folder "Power Analysis" on your flash disk there is a Word document, "Power Analysis.docx". Please open it and follow the instructions.