Genetic Epidemiology M. Tevfik DORAK http://www.dorak.info/epi/genetepi.html

Approaches to the identification of susceptibility genes Rebbeck TR. Cancer 1999 (www)

Palmer LJ. Webcast (www)

GENETIC EPIDEMIOLOGIC RESEARCH METHODS Handbook of Statistical Genetics (John Wiley & Sons) Fig. 28-1 (www)

GENETIC EPIDEMIOLOGY Flow of research
* Disease characteristics: Descriptive epidemiology
* Familial clustering: Family aggregation studies
* Genetic or environmental: Twin/adoption/half-sibling/migrant studies
* Mode of inheritance: Segregation analysis
* Disease susceptibility loci: Linkage analysis
* Disease susceptibility markers: Association studies

Autosomal recessive disorders tend to be more common in populations with a high level of inbreeding (restricted gene pool). Examples are Tangier disease on Tangier Island off the coast of Virginia, USA; many genetic disorders in Ashkenazi Jews (Tay-Sachs disease, Gaucher disease, Fanconi anaemia, Niemann-Pick disease); congenital adrenal hyperplasia (CAH) due to 21-hydroxylase deficiency in Yupik Eskimos; CAH due to 11-beta hydroxylase deficiency in Moroccan Jews; and thalassaemias (beta and alpha) in Cyprus and Sardinia.
Populations such as those of Finland, Iceland and Newfoundland exhibit an increased prevalence of rare recessive diseases (congenital nephrotic syndrome of the Finnish type and Newfoundland rod-cone dystrophy).

Study Designs in Genetic Epidemiology
* nuclear families (index case and parents)
* affected relative pairs (sibs, cousins, any two members of the family)
* extended pedigrees
* twins (monozygotic and dizygotic)
* unrelated population samples

Risk Ratio (Lambda) Genetics in Clinical Research (www)

Sibling Recurrence Risk / Sibling Risk Ratio (λs) Curnow & Smith: J Roy Stat Soc 1975; 138: 139-169

ROCHE Genetic Education (www)

(MacGregor, 2000)

ROCHE Genetic Education (www)

Adoption Studies
1. Compare the risk in biological relatives with adopted relatives of affected adoptees (beware of adoption bias)
2. Compare the risk in biological relatives with adopted relatives of unaffected adoptees

Migrant Studies
* Liao CK et al. Endometrial cancer in Asian migrants to the United States and their descendants. Cancer Causes Control 2003; 14: 357-60 (www)
* Flood DM et al. Colorectal cancer incidence in Asian migrants to the United States and their descendants. Cancer Causes Control 2000; 11: 403-11 (www)
* Feltbower RG et al. Trends in the incidence of childhood diabetes in south Asians and other children in Bradford, UK. Diabet Med 2002; 19: 162-6 (www)
"Children in south Asia have a low incidence of type 1 diabetes but migrants to the UK have similar overall rates to the indigenous population. However, a more steeply rising incidence is seen in the south Asian population, and our data suggest that incidence in this group may eventually outstrip that of the non-south Asians. Genetic factors are unlikely to explain such a rapid change, implying an influence of environmental factors in disease aetiology"

Washington University (www)

Modes of inheritance

ROCHE Genetic Education (www)

Differences between linkage and association
Linkage
* Linkage is a property of loci
* Role: to identify a biological mechanism for transmission of a trait; to locate the gene involved
* Coarse mapping (>1 cM)
* No information about which allelic variant is associated with higher risk of disease
* Requires family pedigrees
* Uses very polymorphic markers
Association
* Association is a property of alleles
* Role: to identify association between an allelic variant and a disease; to identify linkage disequilibrium between a disease allele and a marker
* Fine mapping (<1 cM)
* Case-control or family-based approach
* Usually uses bi-allelic markers

Risch NJ. Nature 2000

Association Studies
* Population-based: cases and unrelated population controls from the same study base
* Family-based: child-family trios analysed with the TDT are the most common design

Odds Ratio: 3.6, 95% CI = 1.3 to 10.4. ROCHE Genetic Education (www)

Genetic Models and Case-Control Association Data Analysis
The data may also be analysed assuming a prespecified genetic model. For example, with the hypothesis that carrying allele B increases the risk of disease (dominant model), the AB and BB genotypes are pooled, giving a 2 x 2 table. This is particularly relevant when allele B is rare, with few BB observations in cases and controls. Alternatively, under a recessive model for allele B, cells AA and AB would be pooled.
Analysing by alleles provides an alternative perspective for case-control data. This breaks down genotypes to compare the total number of A and B alleles in cases and controls, regardless of the genotypes from which these alleles are constructed. This analysis is counter-intuitive, since alleles do not act independently, but it provides the most powerful method of testing under a multiplicative genetic model, where the risk of developing a disease increases by a factor r for each B allele carried: risk r for genotype AB and r^2 for genotype BB. If a multiplicative genetic model is appropriate, both case and control genotypes will be in Hardy–Weinberg equilibrium, and this can be tested for.
A fourth possible genetic model is additive, with an increased disease risk of r for AB genotypes and 2r for BB genotypes. This model shows a clear trend of an increased number of AB and BB genotypes, with the risk for AB genotypes approximately half that for BB genotypes. The additive genetic model can be tested for using Armitage's test for trend.
Lewis CM. Brief Bioinform 2002 (www)
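The four analyses described above can be sketched in a few lines of Python. This is a minimal illustration only: the genotype counts are hypothetical (they are not from Lewis 2002), the function names are made up for this example, and the chi-square tests are the uncorrected Pearson form with 1 degree of freedom.

```python
import math

def chi2_p_1df(stat):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend statistic (1 df) for genotype counts given as (AA, AB, BB)."""
    r, s = sum(cases), sum(controls)
    n = r + s
    totals = [x + y for x, y in zip(cases, controls)]
    # Weighted excess of cases over what is expected if genotype and status are unrelated.
    u = sum(w * (x - r * t / n) for w, x, t in zip(weights, cases, totals))
    sum_w2 = sum(w * w * t for w, t in zip(weights, totals))
    sum_w = sum(w * t for w, t in zip(weights, totals))
    var_u = (r * s / n ** 3) * (n * sum_w2 - sum_w ** 2)
    return u * u / var_u

# Hypothetical genotype counts in the order (AA, AB, BB).
cases = (40, 45, 15)
controls = (60, 35, 5)

# Dominant model for B: pool AB and BB.
dominant = chi2_2x2(cases[0], cases[1] + cases[2], controls[0], controls[1] + controls[2])
# Recessive model for B: pool AA and AB.
recessive = chi2_2x2(cases[0] + cases[1], cases[2], controls[0] + controls[1], controls[2])
# Allelic (multiplicative) test: count A and B alleles instead of genotypes.
allelic = chi2_2x2(2 * cases[0] + cases[1], cases[1] + 2 * cases[2],
                   2 * controls[0] + controls[1], controls[1] + 2 * controls[2])
# Additive model: Armitage's test for trend across AA, AB, BB.
trend = armitage_trend(cases, controls)

for name, stat in [("dominant", dominant), ("recessive", recessive),
                   ("allelic", allelic), ("trend", trend)]:
    print(f"{name:9s} chi2 = {stat:.3f}  p = {chi2_p_1df(stat):.4f}")
```

With only two genotype classes the trend statistic reduces to the Pearson 2 x 2 chi-square, which is a convenient sanity check on the implementation.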

ROCHE Genetic Education (www)

Linkage disequilibrium and population demography
Mapping disease genes by association requires the identification of linkage disequilibrium (LD) between a marker and a disease phenotype. Several studies of African populations have indicated that levels and patterns of LD in these populations differ from those in non-African populations owing to the age of African populations, admixture with other African and non-African populations, and historical differences in population size and substructure.
A disease mutation (shown in violet) that occurs on a single haplotype background will initially be in complete LD with flanking markers on that chromosome (see panel a). In each generation, LD between a marker and a disease allele decays owing to recombination between the sites, and also because of the effects of mutation and gene conversion at marker loci. Young populations, and those that have undergone recent bottlenecks (as probably occurred during the migration of ancestral humans out of Africa), will have haplotype blocks of large to moderate size (panel b, shown in green). In older and larger African populations, in which there has been more recombination, the size of haplotype blocks will probably be smaller (panel c).
LD can also be established by a founder event, with the strength and extent of the LD depending on the severity and length of the bottleneck event. Population substructure increases LD owing to a smaller effective population size and to higher levels of genetic drift in subdivided populations. So, if a pooled sample derived from several African populations was analysed, spurious LD would be detected, even if the haplotypes in each subpopulation were in linkage equilibrium. This could lead to erroneous conclusions about the association between genetic markers and disease phenotype.
Small populations of stable size are expected to show LD between closely linked loci as a result of increased genetic drift, and larger populations will have fewer sites in LD. New mutations are less likely to be in LD in growing populations owing to the smaller effect of genetic drift, but allelic associations that exist before population expansion might persist for a longer period of time in an expanding population than in a population of constant size.
Tishkoff, Nat Reviews Genet 2002 (www)
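The per-generation decay of LD by recombination mentioned above is usually written as D_t = D_0 (1 - θ)^t, where θ is the recombination fraction between the two sites. The sketch below uses that textbook relation with hypothetical starting values; it deliberately ignores drift, mutation, gene conversion and population structure, all of which the slide lists as additional influences.

```python
import math

def ld_after(d0, theta, generations):
    """Expected LD coefficient after t generations: D_t = D_0 * (1 - theta) ** t."""
    return d0 * (1.0 - theta) ** generations

def generations_to_fraction(theta, fraction):
    """Generations needed for D to fall to `fraction` of its initial value."""
    return math.log(fraction) / math.log(1.0 - theta)

# Hypothetical example: a marker at theta = 0.01 (roughly 1 cM) from a new mutation.
d0 = 0.20
for t in (0, 10, 100, 500):
    print(f"generation {t:3d}: D = {ld_after(d0, 0.01, t):.4f}")

half_life = generations_to_fraction(0.01, 0.5)
print(f"generations until D halves at theta = 0.01: {half_life:.1f}")
```

Under this simplified model, tightly linked markers keep their association for many more generations than loosely linked ones, which is the reason association is suited to fine mapping in older populations.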

Mapping Disease Susceptibility Genes by Association Studies (www)

Mapping Disease Susceptibility Genes by Association Studies: plot of minus log of P value for the case-control test for allelic association with AD, for SNPs immediately surrounding APOE (<100 kb). Martin, 2000 (www)

Carlson, 2004 (www)

Sample size requirements for different genetic models Palmer & Cardon, Lancet 2005 (www)

Sample size requirements as a function of allele frequencies Johnson GC et al. Nat Genet 2001 (www)

Sample size requirements as a function of the strength of association Botstein & Risch. Nat Genet 2003 (www)

SNP Selection for Association Studies - Regulatory / Functional SNPs - (www) FastSNP (www) Yuan, 2006 (www)

SNP Selection for Association Studies - Regulatory / Functional SNPs - (www) Yue, 2006 (www)

SNP Selection for Association Studies - Haplotype Tagging SNPs - (www)

Haplotype Association Tabor HK et al. Nature Rev Genetics 2002 (www)

Illustration of tagging SNPs
a | The diagram shows five haplotypes. Twelve single nucleotide polymorphisms (SNPs) are localized in order along the chromosome. The letters on the top indicate groups of SNPs that have perfect pairwise linkage disequilibrium (LD) with one another, and the numbers on the bottom indicate each of the 12 SNPs. SNP 9 is the causal variant, which in this simple example determines drug response: allele C results in a therapeutic response, whereas allele G results in an adverse reaction. In this example, the selection of just one SNP from each of the groups A–E would be sufficient to fully represent all of the haplotype diversity. Each haplotype can be identified by just five tagging SNPs (tSNPs), and the causal variant would be tagged even if it were not itself typed (in fact, multi-marker approaches to tSNP selection would reduce the set of tags to fewer than five, but this is ignored for simplicity). So, tSNP profiles that are highlighted predict an adverse reaction to the medicine. Normally, LD patterns are not so clear-cut and statistical methods are required to select appropriate sets of tSNPs.
b | The diagram depicts the same 12 SNPs, but with different associations among them, as might happen in a different population group. Because patterns of LD are different, some patients would be misclassified if the same five tSNPs were used and interpreted in the same way; that is, using the same SNP profiles as defined in population A, haplotype profiles 1, 2 and 3 are predicted to have allele C at the causal SNP 9 (a therapeutic response), whereas haplotype profiles 4 and 5 are predicted to have an adverse response. However, because the pattern of association has changed, the new haplotypes 6 and 7 are misclassified as haplotype patterns 6 and 7 in population B.
Goldstein, Nat Rev Genet 2003 (www)
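The idea in panel a, grouping SNPs that are in perfect pairwise LD and typing one SNP per group, can be sketched as follows. The haplotypes below are hypothetical (five haplotypes, six bi-allelic SNPs) and are not the ones in the Goldstein figure; `ld_groups` and `pick_tags` are illustrative names. The grouping rule assumes bi-allelic, polymorphic SNPs and only detects perfect LD, so it is not a substitute for the statistical tSNP selection methods the caption mentions.

```python
def ld_groups(haplotypes):
    """Group SNP positions (1-based) that split the haplotypes in exactly the same way."""
    groups = {}
    for pos in range(len(haplotypes[0])):
        column = [h[pos] for h in haplotypes]
        # For bi-allelic SNPs, two SNPs are in perfect LD when they partition the
        # haplotypes identically; compare each allele with that of the first haplotype.
        pattern = tuple(allele == column[0] for allele in column)
        groups.setdefault(pattern, []).append(pos + 1)
    return list(groups.values())

def pick_tags(groups):
    """Choose the first SNP of each perfect-LD group as its tagging SNP."""
    return [group[0] for group in groups]

# Hypothetical haplotypes: one string per haplotype, one character per SNP.
haplotypes = ["AACGTT", "AACGCC", "GGCGTT", "GGTATT", "AATACC"]

groups = ld_groups(haplotypes)
tags = pick_tags(groups)
profiles = ["".join(h[t - 1] for t in tags) for h in haplotypes]

print("LD groups:", groups)
print("tag SNPs :", tags)
print("profiles :", profiles)
```

In this toy data the three tag SNPs still distinguish all five haplotypes, so any untyped SNP in a group is recovered from its tag. As panel b warns, the same tags may misclassify haplotypes in a population with a different LD pattern.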

Erichsen & Chanock. Br J Cancer 2004 (www)

Associations with Ancestral Haplotypes (Schork, 1998)

Dorak, 2002 (www) Ayala, 1994 (www)

Palmer LJ. Webcast (www)

Wacholder, 2002 (www)

Population Stratification Marchini, 2004 (www)

Population Stratification Cardon & Palmer, 2003 (www)

Multiple Comparisons & Spurious Associations Diepstra, Lancet 2005 (www)

Family-based association study designs
* Haplotype Relative Risk (HRR) Method (Falk & Rubinstein, 1987; Knapp, 1993)
* Affected Family-Based Controls (AFBAC) Method (Thomson, 1995)
* Transmission Disequilibrium/Distortion Test (TDT) (Spielman, 1993 & 1994; Ewens & Spielman, 1995)
Reviews: (Thomson, 1995; Gauderman, 1999)

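Parent-Case Trios in TDT/HRR [Pedigree diagram: parent-case trios with genotypes formed from alleles A to D; the parental alleles transmitted to the affected child make up the "case" genotype ("transmitted allele") and the non-transmitted parental alleles make up the "control" genotype ("non-transmitted allele")]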

AN EXAMPLE OF TDT: TRANSMISSION DISEQUILIBRIUM OF HLA-B62 TO THE PATIENTS WITH CHILDHOOD AML (Dorak et al, BSHI 2002)

                      Nontransmitted Allele
Transmitted Allele    B62     Other
B62                   x       12
Other                 1       y

Out of 13 parents heterozygous for B62, 12 transmitted B62 to the affected child and 1 did not.
McNemar's test results: P = 0.006 (with continuity correction); odds ratio = 12.0, 95% CI = 1.8 to 513
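The figures on this slide can be approximately reproduced from the two discordant counts (12 transmissions, 1 non-transmission). The sketch below uses McNemar's chi-square with continuity correction for the P value. For the confidence interval it assumes an exact (Clopper-Pearson) binomial interval on the transmission proportion, converted to an odds ratio; that choice is a guess about how the quoted limits were obtained, although it does give values close to 1.8 and 513. Function names are illustrative.

```python
import math

def mcnemar(b, c, continuity=True):
    """McNemar chi-square (1 df) and p-value from the two discordant counts."""
    diff = abs(b - c) - (1.0 if continuity else 0.0)
    stat = max(diff, 0.0) ** 2 / (b + c)
    return stat, math.erfc(math.sqrt(stat / 2.0))

def _binom_pmf(i, n, p):
    return math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)

def _bisect_increasing(f, target, iterations=200):
    """Solve f(p) = target on [0, 1] for a function f that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def exact_or_ci(b, c, alpha=0.05):
    """Exact CI for the matched-pair odds ratio b/c via Clopper-Pearson limits on b/(b+c)."""
    n = b + c
    at_least_b = lambda p: sum(_binom_pmf(i, n, p) for i in range(b, n + 1))
    more_than_b = lambda p: sum(_binom_pmf(i, n, p) for i in range(b + 1, n + 1))
    p_low = _bisect_increasing(at_least_b, alpha / 2.0) if b > 0 else 0.0
    p_high = _bisect_increasing(more_than_b, 1.0 - alpha / 2.0) if b < n else 1.0
    to_odds = lambda p: math.inf if p >= 1.0 else p / (1.0 - p)
    return to_odds(p_low), to_odds(p_high)

transmitted, not_transmitted = 12, 1  # heterozygous parents, from the slide
chi2, p_value = mcnemar(transmitted, not_transmitted)
odds_ratio = transmitted / not_transmitted
ci_low, ci_high = exact_or_ci(transmitted, not_transmitted)

print(f"chi2 = {chi2:.2f}, P = {p_value:.3f}")
print(f"odds ratio = {odds_ratio:.1f}, 95% CI = {ci_low:.1f} to {ci_high:.0f}")
```

Only heterozygous parents contribute to the test, which is why the cells marked x and y in the table do not enter the calculation.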

Multifactorial Etiology ROCHE Genetic Education (www)

Models of gene–environment interactions Hunter, 2005 (www)

Sample size requirement for gene-environment interaction studies Hunter, 2005 (www)

An example of a gene-environment interaction: in Alzheimer disease, the risk of cognitive decline as measured by the TICS test is particularly high in APOE 4 carriers who have untreated hypertension (APOE 4+/HT+). Hunter, 2005 (www)

Falconer's polygenic threshold model for dichotomous nonmendelian characters: Liability to the condition is polygenic and normally distributed (upper curve). People whose liability is above a certain threshold value are affected. Their sibs (lower curve) have a higher average liability than the population mean and a greater proportion of them have liability exceeding the threshold. Therefore the condition tends to run in families (Falconer DS, 1967).
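The threshold model above can be turned into a rough numerical estimate of sibling recurrence risk and of the sibling risk ratio (sibling risk divided by population prevalence). The sketch below is an approximation under stated assumptions: liability is standard normal, the liability correlation between sibs is half the heritability (no dominance or shared environment), and the sibs' liability is treated as normal with a reduced variance. The prevalence and heritability values are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # liability is scaled to mean 0, variance 1

def sibling_risk(prevalence, heritability, relatedness=0.5):
    """Approximate recurrence risk in relatives of affected people under the threshold model."""
    threshold = STD_NORMAL.inv_cdf(1.0 - prevalence)
    # Mean liability of affected individuals (mean of the normal tail above the threshold).
    mean_affected = STD_NORMAL.pdf(threshold) / prevalence
    r = relatedness * heritability  # assumed liability correlation between relatives
    mean_sib = r * mean_affected
    var_sib = 1.0 - r * r * mean_affected * (mean_affected - threshold)
    return 1.0 - STD_NORMAL.cdf((threshold - mean_sib) / math.sqrt(var_sib))

# Hypothetical values: 1% population prevalence, 80% heritability of liability.
prevalence = 0.01
risk = sibling_risk(prevalence, 0.8)
lambda_s = risk / prevalence

print(f"sibling recurrence risk = {risk:.3f}")
print(f"sibling risk ratio      = {lambda_s:.1f}")
```

Setting the heritability to zero returns the population prevalence, which matches the model's statement that familial clustering arises from the shift in the sibs' liability distribution.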

M. Tevfik DORAK http://www.dorak.info
