Gene Hunting: Linkage and Association We humans are diploid (i.e., we have two copies of each gene), inheriting one copy of each chromosome from our mother and the other from our father. In transmitting a chromosome to an offspring, however, the physical process of recombination (crossing over) produces a chromosome that contains part of the maternal chromosome and part of the paternal chromosome. Recombination also makes possible a number of different analytical strategies in genetics: linkage, ancestry tracing, and some forms of association. Key terms: polymorphism, recombination, crossing over, linkage analysis, association design, haplotype, linkage equilibrium/disequilibrium, GWAS (genome-wide association study).
Recombination (Crossing Over) In meiosis, homologous chromosomes join together at a section and exchange genetic material. Homologous chromosomes: chromosomes with the same genes on them, e.g., your paternal chromosome number 1 and your maternal chromosome number 1.
Recombination: • Linkage Analysis • Population Haplotype Analysis • Ancestry Tracing • Association • Genome-wide Association (Current “Hot” Technology)
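Example: [Figure: a pair of homologous chromosomes carrying alleles A/a, b/B, C/c, and d/D; after crossing over, the C/c and d/D alleles appear exchanged between the two chromosomes.]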
Key Point about Recombination: Recombination is a function of physical distance. • If two alleles are separated by 8 nucleotides, then there are “8 chances” of a recombination event between the two. • If two alleles are separated by 257 nucleotides, then there are “257 chances” of a recombination event between the two. • Therefore, alleles on the same DNA strand that are far apart are more likely to be broken up by recombination than alleles that are close together.
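The “chances” idea can be sketched numerically. This is a minimal illustration, assuming every nucleotide gap independently carries the same small crossover probability per meiosis; the function name and the rate of 1e-8 are assumptions made for the example, not values from the slides.

```python
def chance_of_crossover(n_nucleotides, rate_per_nucleotide=1e-8):
    """Probability of at least one crossover between two alleles.

    Assumes (for illustration only) that each of the n_nucleotides gaps
    independently has the same small chance of a crossover.
    """
    return 1.0 - (1.0 - rate_per_nucleotide) ** n_nucleotides


# More nucleotides between two alleles means more chances of a crossover.
for gap in (8, 257, 1_000_000):
    print(gap, chance_of_crossover(gap))
```

For small gaps the result is close to the gap length times the per-nucleotide rate, which is the sense in which 257 nucleotides give roughly 257 chances and 8 nucleotides give roughly 8.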
[Figure: Original Chromosomes (Dad, Mom), each carrying Allele 1 and Allele 2, shown in three steps: Pair Up, Exchange Material, New Chromosomes.]
[Figure: a DNA sequence with spans marked as giving 3 chances, 10 chances, and 17 chances of recombination.]
In other words: Alleles close together on the same DNA strand (i.e., the same chromosome) tend to be transmitted as a unit. Alleles far apart on the same DNA strand tend to be broken up.
Definitions: Linkage: The biological phenomenon that alleles close to one another on the same chromosome tend to be transmitted as a unit. Linkage Analysis: (1) tracing the co-segregation of (2) one or more marker genes with a trait gene (3) within pedigrees (families).
Definitions: Trait gene: A gene that contributes to the trait of interest, e.g., schizophrenia. Marker gene: A polymorphic “gene” that does not contribute to the trait but has a known location in the genome.
Rationale for Linkage Analysis Can I predict who gets the disorder (trait) by knowing the marker genes in a family? YES: A trait gene is close to a marker. NO: No trait genes are close to the marker.
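The question on this slide can be turned into a toy calculation: how often does the marker allele a child receives from the doubly heterozygous parent agree with the child's trait status? The data, the function name, and the choice of "A" as the allele assumed to travel with the trait allele are all hypothetical; real linkage analysis uses likelihood-based statistics rather than this simple proportion.

```python
# Hypothetical toy family: (marker allele received from the father, affected?)
children = [
    ("A", True), ("A", True), ("a", False),
    ("a", False), ("A", True), ("a", True),
]


def cosegregation(children, risk_marker="A"):
    """Fraction of children whose marker allele predicts their trait status.

    risk_marker is the marker allele assumed to sit on the same paternal
    chromosome as the trait allele.
    """
    agree = sum((marker == risk_marker) == affected
                for marker, affected in children)
    return agree / len(children)


print(cosegregation(children))
```

A proportion near 1 (or near 0) corresponds to the YES case, where the marker tracks the trait; a proportion near 0.5 corresponds to the NO case.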
Linkage Analysis [Figure: the father's two chromosomes, carrying marker alleles A and a and trait alleles D and d, shown with a pedigree whose members have marker genotypes aa, Aa, Aa, Aa, aa.]
Linkage Analysis [Figure: the father's two chromosomes, carrying marker alleles A and a and trait alleles D and d, shown with a pedigree whose members have marker genotypes aa, Aa, Aa, aa.]
[Figure: three-generation pedigree (I.1 and I.2; II.1 to II.8; III.1 to III.8) with each person labeled by marker genotype AA, Aa, or aa.]
[Figure: the same three-generation pedigree (I.1 = Aa, I.2 = aa; II.1 to II.8; III.1 to III.8) with each person labeled by marker genotype and by trait genotype (Dd or dd).]
Haplotype: Series of alleles along a short section of the same strand of DNA.
Strand:                           Haplotype:
ATCTGCCTCGCCATAAAGTCATTCGCTCAT    TC
ATCTGCCTCGCCATAAAGTCATTCGCTGAT    TG
ATCAGCCTCGCCATAAAGTCATTCGCTCAT    AC
ATCAGCCTCGCCATAAAGTCATTCGCTGAT    AG
Alleles at position 4: T, A. Alleles at position 28: C, G.
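Reading a haplotype off a strand amounts to picking out the alleles at the polymorphic positions. A minimal sketch using the four strands from the slide; the function name is an assumption for the example, and positions are 1-based as on the slide.

```python
STRANDS = [
    "ATCTGCCTCGCCATAAAGTCATTCGCTCAT",
    "ATCTGCCTCGCCATAAAGTCATTCGCTGAT",
    "ATCAGCCTCGCCATAAAGTCATTCGCTCAT",
    "ATCAGCCTCGCCATAAAGTCATTCGCTGAT",
]


def haplotype(strand, positions=(4, 28)):
    """Return the alleles found at the given 1-based positions."""
    return "".join(strand[p - 1] for p in positions)


for strand in STRANDS:
    print(strand, haplotype(strand))
```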
Linkage Equilibrium & Disequilibrium If I know the first allele in a haplotype, can I predict the second allele? Yes: Linkage Disequilibrium. No: Linkage Equilibrium.
Linkage Equilibrium & Disequilibrium In other words: Equilibrium: The frequency of a haplotype is what chance predicts (the product of its allele frequencies). Disequilibrium: The frequency of a haplotype differs from the chance frequency.
Haplotype (Graduate) Chance: If the frequency of allele T is .2 and the frequency of allele C is .4, then the frequency of haplotype TC is .2 × .4 = .08. Nonchance: If the frequency of allele T is .2 and the frequency of allele C is .4, then the frequency of haplotype TC is significantly different from .08.
Haplotype (Graduate)
                Position 28:
                C      G
Position 4:  T  TC     TG
             A  AC     AG
Equilibrium (Graduate)
                Position 28:
                C      G
Position 4:  T  .08    .12    .2
             A  .32    .48    .8
                .4     .6
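The equilibrium table can be reproduced by multiplying the allele frequencies in the margins. A minimal sketch; the function name is an assumption for the example, while the frequencies are the ones on the slide.

```python
def equilibrium_table(freq_pos4, freq_pos28):
    """Expected haplotype frequencies when the two positions are independent."""
    return {
        a4 + a28: f4 * f28
        for a4, f4 in freq_pos4.items()
        for a28, f28 in freq_pos28.items()
    }


expected = equilibrium_table({"T": 0.2, "A": 0.8}, {"C": 0.4, "G": 0.6})
for hap, freq in expected.items():
    print(hap, round(freq, 2))
```

Observed haplotype frequencies that match this table indicate equilibrium; observed frequencies that depart from it indicate disequilibrium.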
Disequilibrium (Graduate)
                Position 28:
                C      G
Position 4:  T  .16    .04    .2
             A  .24    .56    .8
                .4     .6
Statistics for Equilibrium (Graduate)
                Position 28:
                C          G
Position 4:  T  $X_{11}$   $X_{12}$   $p_1$
             A  $X_{21}$   $X_{22}$   $q_1$
                $p_2$      $q_2$
Statistics for Equilibrium (Graduate)
$d = X_{11} X_{22} - X_{12} X_{21} = \mathrm{cov}(L_1, L_2)$
$D = X_{11} - p_1 p_2 = X_{22} - q_1 q_2$
If $D > 0$, $D' = D / D_{max}$ where $D_{max} = \min(p_1 q_2, p_2 q_1)$
If $D < 0$, $D' = D / D_{max}$ where $D_{max} = \min(p_1 p_2, q_1 q_2)$
$R^2 = d^2 / (p_1 p_2 q_1 q_2)$
$D'$ and $R^2$ are the most often used stats.
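The statistics on this slide can be computed directly from the four haplotype frequencies. This is a minimal sketch following the slide's definitions; the function name and the example table (chosen to have row margins .2/.8 and column margins .4/.6) are assumptions for illustration, and both positions are assumed to be polymorphic so that no margin is zero.

```python
def ld_stats(x11, x12, x21, x22):
    """Return (D, D', R^2) for a 2 x 2 table of haplotype frequencies."""
    p1, q1 = x11 + x12, x21 + x22      # allele frequencies at the first locus
    p2, q2 = x11 + x21, x12 + x22      # allele frequencies at the second locus
    d = x11 * x22 - x12 * x21          # equals X11 - p1*p2 when the table sums to 1
    if d > 0:
        d_max = min(p1 * q2, p2 * q1)
    else:
        d_max = min(p1 * p2, q1 * q2)
    d_prime = d / d_max                # negative when d is negative, as on the slide
    r2 = d * d / (p1 * p2 * q1 * q2)
    return d, d_prime, r2


# Illustrative disequilibrium table (assumed values).
print(ld_stats(0.16, 0.04, 0.24, 0.56))
# Equilibrium table: D is zero up to floating-point rounding.
print(ld_stats(0.08, 0.12, 0.32, 0.48))
```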
Formation of Disequilibrium (Graduate) 1. Mutation occurs and creates a new spelling variation (polymorphism). 2. This creates linkage disequilibrium with those polymorphisms along the same DNA strand with the mutation. 3. Over generations, recombination will break up the disequilibrium with polymorphisms that are far away from the mutation. 4. Polymorphisms close to the original mutation, however, will remain in disequilibrium for a longer time. 5. Hence, polymorphisms close to the mutation will be in disequilibrium longer than polymorphisms farther away from the mutation.
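Steps 3 to 5 can be illustrated with the standard population-genetics result that, under random mating, disequilibrium shrinks by a factor of (1 - θ) each generation, where θ is the recombination fraction between the two polymorphisms. The starting value of D, the two θ values, and the number of generations below are assumptions chosen for the example.

```python
def ld_after(d0, recomb_fraction, generations):
    """Disequilibrium remaining after the given number of generations."""
    return d0 * (1.0 - recomb_fraction) ** generations


D0 = 0.08  # assumed starting disequilibrium created by the new mutation
near = ld_after(D0, recomb_fraction=0.001, generations=100)  # close polymorphism
far = ld_after(D0, recomb_fraction=0.1, generations=100)     # distant polymorphism
print(near, far)
```

With these assumed values most of the disequilibrium with the close polymorphism survives 100 generations, while the disequilibrium with the distant polymorphism has almost entirely decayed.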
Disequilibrium: 1. Is the norm rather than the exception for short sections of DNA (100,000 nucleotides). 2. Generates “haplotype blocks” (see next slide). 3. Haplotype Mapping Project (HapMap): provides a map of the haplotype blocks for the human genome. 4. Allows genome-wide association studies.
Haplotype Blocks: [Figure: a section of DNA in which each vertical bar is a polymorphism, with groups of bars labeled Block 1, Block 2, and Block 7.] • Haplotype Block: Series of adjacent alleles in strong disequilibrium. • Logic: Instead of genotyping all 37 polymorphisms, genotype one in each block. • If there is a “hit,” then go back and genotype the other polymorphisms in that block.
Haplotype block structure of the cytochrome P450 CYP2C gene cluster on chromosome 10. From Walton et al. (2005), Nature Genetics 37, 915-916.
Association Design • Begins with a KNOWN polymorphism theoretically expected to be associated with the trait (e.g., DRD2 and schizophrenia). • Genotypes people on the gene and phenotypes them on the trait. • Tests whether the genotype is associated with the trait. • Two types: (1) Population-based (controls = general population); (2) Family-based (controls = genetic relatives).
Population-based Association Design
                         Genotype:
                         AA     Aa     aa
Phenotype:  Schiz:
            Not Schiz:
Do a χ² test for association.
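The χ² test for a genotype-by-phenotype table can be sketched in a few lines. The counts below are hypothetical, invented only to show the arithmetic, and the function names are assumptions. The p-value shortcut exp(-χ²/2) holds only for 2 degrees of freedom, which is the case for a 2 x 3 table.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def p_value_df2(chi2):
    """Upper-tail p-value of a chi-square statistic with exactly 2 df."""
    return math.exp(-chi2 / 2.0)


# Hypothetical counts; columns are genotypes AA, Aa, aa.
counts = [
    [30, 50, 20],  # Schiz
    [20, 50, 30],  # Not Schiz
]
chi2, df = chi_square(counts)
print(chi2, df, p_value_df2(chi2))
```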
Genome-wide Association Study (GWAS) (1) Genotype one locus per haplotype block. (2) Do an association test for every gene. (3) Number of genes that can be assayed changes from year to year.
GWAS: Genome-wide Association Study 1. DNA arrays with 1,000s of SNPs scattered throughout the genome. (Current chips in 2009 have 1,000 different SNPs.) 2. Select the SNPs so that they cover ALL the genome. (Some DNA chips concentrate on known protein-coding regions rather than trying to cover all the genome.) 3. Genotype patients and controls on all the SNPs. 4. Find the SNPs that differ. 5. Problem: number of statistical tests.
Problems with GWAS (1) Expensive. (2) Large number of statistical tests. (3) Need very, very large samples (10,000 or more).
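The multiple-testing problem can be made concrete with a small calculation. The slides do not name a correction method; the Bonferroni correction used here is one common choice, swapped in for illustration, and the number of tests is a hypothetical value.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of chance 'hits' if no SNP is truly associated."""
    return alpha * n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the overall error rate near alpha."""
    return alpha / n_tests


N_TESTS = 1_000_000  # hypothetical number of SNPs tested
print(expected_false_positives(0.05, N_TESTS))
print(bonferroni_threshold(0.05, N_TESTS))
```

A per-test threshold this strict is one reason very large samples are needed: small genetic effects only reach such significance levels when there are many patients and controls.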
Results from GWAS (1) Good success in medicine. (2) Limited success for psychiatric disorders. (3) Virtually no success for normal behavioral traits (personality, IQ). (4) Genetics of behavior is hyper-polygenic: many, many genes.