Course Outline I • Week 1: Course Overview, Basic Knowledge of Genome Biology, Basic Principles of Population Genetics • Week 2: Linkage Analysis for Family Data I • Week 3: Linkage Analysis for Family Data II • Week 4: Introduction to Microarray Data Analysis • Week 5: Nature of Discrete Genetic Data & Estimating Frequencies • Week 6: Disequilibrium & Diversity • Week 7: Population Structure, Individual Identification & Outcrossing and Selection • Week 8: Linkage • Week 9: Midterm
Course Outline II • Week 10: Phylogeny Reconstruction & Quantitative Genetics I • Week 11: Quantitative Genetics II • Week 12: QTL Mapping I • Week 13: QTL Mapping II • Week 14: Population-based Association Analysis • Week 15: Family-based Association Analysis • Week 16: Multipoint Association Analysis • Week 17: Genomewide Association Analysis
Thomas Andrew Knight (1759-1838) Thomas Andrew Knight was the first man to practice large-scale, systematic strawberry breeding, which produced two famous varieties: the Downton and the Elton. As a founder and long-time president of England's Royal Horticultural Society, he encouraged others to breed better varieties of fruits and vegetables.
Thomas Andrew Knight • Knight's father was a Herefordshire clergyman who died when his son was five years old. The boy's education was neglected, and until he was nine he remained almost illiterate. Since he was unable to read as a child, he concentrated his curiosity on the plant and animal life on the family estate. One day, says a story, he saw a gardener planting beans. The boy asked why the man was planting sticks of wood and was told they would grow up to be beans. The gardener's prediction came true. Knight immediately planted his pocket knife and waited in anticipation for the miraculous growth of new knives. When the experiment failed he sat down to consider the difference in the two cases. Already he was engrossed with the mysteries of the vital processes in plants, a preoccupation which would lead later to his reputation as a brilliant plant physiologist.
Downton (1817) Elton (1828)
Knight didn’t count, Mendel did count.
Gregor Mendel (1822-1884) By the 1890s, the invention of better microscopes allowed biologists to discover the basic facts of cell division and sexual reproduction. The focus of genetics research then shifted to understanding what really happens in the transmission of hereditary traits from parents to children. A number of hypotheses were suggested to explain heredity, but Gregor Mendel, a little-known Central European monk, was the only one who got it more or less right. His ideas had been published in 1866 but went largely unrecognized until 1900, long after his death. His early adult life was spent in relative obscurity doing basic genetics research and teaching high school mathematics, physics, and Greek in Brno (now in the Czech Republic). In his later years, he became the abbot of his monastery and put aside his scientific work.
James Watson (born 1928) Francis Crick (1916-2004)
Slides 15 to 36 are edited from
and Bonnie Berger MIT
The human genome • The cell is the fundamental working unit of every living organism. • Humans consist of trillions of cells (multicellular); other organisms, like yeast, consist of a single cell (unicellular). • Cells are of many different types (e.g. blood, skin, nerve cells), but all can be traced back to a single cell, the fertilized egg.
Nucleus
Eukaryota: More on Morphology
The human genome in numbers • 23 pairs of chromosomes • 2 meters of DNA • 3,000,000,000 bp • Genetic length of about 35 M (males 27 M, females 44 M) • 30,000-40,000 genes
The human genome • The genome, or blueprint for all cellular structures and activities in our body, is encoded in DNA molecules. • Each cell contains a complete copy of the organism’s genome.
The human genome • The human genome is distributed along 23 pairs of chromosomes: 22 autosomal pairs and the sex chromosome pair, XX for females and XY for males. • In each pair, one chromosome is paternally inherited, the other maternally inherited (cf. meiosis).
The human genome • Chromosomes are made of compressed and entwined DNA. • A (protein-coding) gene is a segment of chromosomal DNA that directs the synthesis of a protein.
DNA • A deoxyribonucleic acid (DNA) molecule is a double-stranded polymer composed of four basic molecular units called nucleotides. • Each nucleotide comprises a phosphate group, a deoxyribose sugar, and one of four nitrogenous bases: adenine (A), guanine (G), cytosine (C), and thymine (T). • The two chains are held together by hydrogen bonds between the bases. • Base-pairing occurs according to the following rule: G pairs with C, and A pairs with T.
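A minimal Python sketch of the base-pairing rule above. The names PAIR, complement, and reverse_complement are illustrative and do not come from the slides; the reversal step assumes the standard fact that the two strands run in opposite directions.

```python
# Watson-Crick pairing rule from the slide: G pairs with C, A pairs with T.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIR[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own 5'->3' direction.

    Assumption (not stated on the slide): the two strands are antiparallel,
    so the partner strand is the complement read backwards.
    """
    return complement(strand)[::-1]


print(complement("GATTACA"))          # CTAATGT
print(reverse_complement("GATTACA"))  # TGTAATC
```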
Genes control the making of cell parts • The gene is a fundamental unit of inheritance – A DNA molecule contains tens of thousands of genes – Each gene governs the making of one functional element, one “part” of the cell machine – Every time a “part” must be made, a piece of the genome is copied, transported, and used as a blueprint • RNA is a temporary copy – The medium for transporting genetic information from the DNA information repository to the protein-making machinery is an RNA molecule – The more parts are needed, the more copies are made – Each mRNA lasts only a limited time before degradation
The genetic code • DNA: sequence of four different nucleotides. • Protein: sequence of twenty different amino acids. • The correspondence between DNA’s four-letter alphabet and a protein’s twenty-letter alphabet is specified by the genetic code, which relates nucleotide triplets or codons to amino acids.
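The codon-to-amino-acid correspondence can be sketched as a lookup table. The block below is an illustration, not part of the original slides: it builds the standard genetic code from a compact 64-letter string (bases in T, C, A, G order, "*" for stop) and translates a short made-up sequence. The names CODON_TABLE and translate are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon (one-letter amino acids)."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


print(len(CODON_TABLE))          # 64 codons (4 ** 3)
print(len(set(AMINO) - {"*"}))   # 20 amino acids
print(translate("ATGGCCTAA"))    # MA*
```

Because 64 codons encode only 20 amino acids plus stop signals, several codons map to the same amino acid.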
Big Picture
Basic human genetics • 46 chromosomes: 22 pairs of autosomal chromosomes and 2 sex chromosomes • Chromosome structure: p-arm, centromere, q-arm • Double-stranded DNA • 4 bases: A = Adenine, T = Thymine, G = Guanine, C = Cytosine • Approximately 3 000 000 000 base pairs in the human genome
The Central Dogma of Molecular Biology
Basic Principles of Population Genetics • Reference: Kenneth Lange, Mathematical and Statistical Methods for Genetic Analysis
Mendel’s experiment data
Trait             Dominant form   Recessive form   Dominant count   Recessive count
stem length       tall            short            787              277
pod shape         inflated        constricted      882              299
seed shape        round           wrinkled         5474             1850
seed colour       yellow          green            6022             2001
flower position   axial           terminal         651              207
flower colour     purple          white            705              224
pod colour        green           yellow           428              152
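As a rough check of how close these counts come to a 3:1 ratio, the sketch below computes the dominant-to-recessive ratio and a one-degree-of-freedom chi-square statistic for each trait. The chi-square test is an addition for illustration, not something the slides perform; MENDEL_F2 and chi_square_3_to_1 are hypothetical names.

```python
# Counts (dominant, recessive) taken from the slide's table.
MENDEL_F2 = {
    "stem length": (787, 277),
    "pod shape": (882, 299),
    "seed shape": (5474, 1850),
    "seed colour": (6022, 2001),
    "flower position": (651, 207),
    "flower colour": (705, 224),
    "pod colour": (428, 152),
}


def chi_square_3_to_1(dominant: int, recessive: int) -> float:
    """Pearson chi-square statistic against an expected 3:1 split."""
    total = dominant + recessive
    exp_dom, exp_rec = 0.75 * total, 0.25 * total
    return (dominant - exp_dom) ** 2 / exp_dom + (recessive - exp_rec) ** 2 / exp_rec


for trait, (dom, rec) in MENDEL_F2.items():
    print(f"{trait:16s} ratio {dom / rec:.2f}:1  chi2 {chi_square_3_to_1(dom, rec):.3f}")
```

Every statistic falls below 3.84, the usual 5% critical value for one degree of freedom, so none of the seven traits departs significantly from 3:1.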
Mendel’s First Law • First generation: RR x rr • Second generation: Rr x Rr (self cross) • Third generation: RR + Rr (3/4), rr (1/4)
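The 3/4 to 1/4 split can be reproduced by enumerating the equally likely allele combinations, as in a Punnett square. This is a minimal sketch; the function name cross and the two-letter genotype strings are illustrative choices, not from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict:
    """Offspring genotype probabilities when each parent passes one allele at random."""
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


print(cross("RR", "rr"))  # every offspring is Rr
f2 = cross("Rr", "Rr")
print(f2)                 # RR 1/4, Rr 1/2, rr 1/4
print(f2["RR"] + f2["Rr"])  # 3/4 show the dominant form
```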
Mendel’s Second Law • Two traits are inherited independently of each other
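Under independence, joint phenotype probabilities are products of the single-trait probabilities, which gives the familiar 9:3:3:1 ratio for a cross of two double heterozygotes. The sketch below assumes each trait separately shows the 3:1 split; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Single-trait phenotype probabilities among offspring of a heterozygote self cross.
single = {"dominant": Fraction(3, 4), "recessive": Fraction(1, 4)}

# Independent assortment: multiply the probabilities of the two traits.
joint = {(a, b): single[a] * single[b] for a, b in product(single, repeat=2)}

for phenotypes, prob in joint.items():
    print(phenotypes, prob)  # 9/16, 3/16, 3/16, 1/16
```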
What if the traits are not independent?
Genetic and physical maps • Physical distance: number of base pairs (bp). • Genetic distance: expected number of crossovers between two loci, per chromatid, per meiosis. Measured in Morgans (M) or centimorgans (cM). • 1 cM ≈ 1 million bp (1 Mb).
Definition • The genetic map distance (in units of Morgans) between two loci is defined as the expected (average) number of crossovers occurring on a single chromosome (in a gamete) between the two loci. • Ex: Chromosome 1: physical length 263 Mb; female map length 3.76 M = 376 cM; male map length 2.21 M = 221 cM • Note: 1 Mb ≈ 1 cM
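The chromosome 1 figures make the rule of thumb concrete. The short calculation below uses only the numbers on the slide; the sex-averaged value is a simple mean added here for illustration.

```python
physical_mb = 263   # physical length of chromosome 1 in Mb (from the slide)
female_cm = 376     # female map length in cM
male_cm = 221       # male map length in cM

female_rate = female_cm / physical_mb
male_rate = male_cm / physical_mb
# Assumption: a plain average of the two sexes, used only as a summary number.
average_rate = (female_cm + male_cm) / 2 / physical_mb

print(f"female: {female_rate:.2f} cM/Mb")   # female: 1.43 cM/Mb
print(f"male:   {male_rate:.2f} cM/Mb")     # male:   0.84 cM/Mb
print(f"mean:   {average_rate:.2f} cM/Mb")
```

Both values sit near 1 cM per Mb, with the female map noticeably longer than the male map.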
Crossover, Recombination [Figure: mother's and father's chromosomes, with a crossover shown in sibling 1] • Recombination: crossover occurs an odd number of times between the two genes • Haldane mapping function: the recombination frequency between 2 genes as a function of distance, $Q(d) = (1 - e^{-2\lambda d})/2$ • Assume that crossovers along a chromosome occur as a Poisson process
Haldane Mapping Function • Assume crossovers occur as a Poisson process along the chromosome • rate: $\lambda$ • physical distance: $d$
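Haldane Mapping Function • With the number of crossovers between A and B Poisson distributed with mean $\lambda d$: $Q(d)$ = P(recombination between A and B) = P(odd number of crossovers between A and B) = $\sum_{k=0}^{\infty} e^{-\lambda d} \frac{(\lambda d)^{2k+1}}{(2k+1)!} = e^{-\lambda d} \sinh(\lambda d) = \frac{1 - e^{-2\lambda d}}{2}$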
Haldane Mapping Function
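A small numerical sketch of the Haldane mapping function, assuming crossovers follow a Poisson process with rate lam per unit distance (lam = 1 when distance is in Morgans). The function names are illustrative. The series check confirms that the probability of an odd number of crossovers matches the closed form.

```python
import math


def haldane(d: float, lam: float = 1.0) -> float:
    """Recombination frequency at distance d: (1 - exp(-2 * lam * d)) / 2."""
    return 0.5 * (1.0 - math.exp(-2.0 * lam * d))


def haldane_inverse(theta: float, lam: float = 1.0) -> float:
    """Distance that yields recombination frequency theta (requires 0 <= theta < 1/2)."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must lie in [0, 0.5)")
    return -math.log(1.0 - 2.0 * theta) / (2.0 * lam)


def prob_odd_crossovers(d: float, lam: float = 1.0, terms: int = 40) -> float:
    """Sum Poisson(lam * d) probabilities over odd counts 1, 3, 5, ..."""
    mean = lam * d
    return sum(math.exp(-mean) * mean ** k / math.factorial(k)
               for k in range(1, 2 * terms, 2))


for d in (0.01, 0.1, 0.5, 2.0):
    print(d, round(haldane(d), 4), round(prob_odd_crossovers(d), 4))
```

For short distances the recombination frequency is close to the distance itself (about 0.0099 at 0.01 M), and it approaches but never exceeds 1/2 as distance grows.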
• The following 5 slides serve as a reference for basic human genetics terminology.
1.2 Genetics Background The cells of all organisms, from bacteria to humans, contain one or more sets of a basic DNA complement that is unique to the species. This fundamental complement of DNA is called a genome. The genome may be subdivided into chromosomes, each of which is a very long, single continuous DNA molecule. In turn, a chromosome can be demarcated along its length into thousands of functional regions called genes. The word gene was originally used for the unit factor of heredity. In modern terminology, a gene is a specific coding sequence of DNA. The alternate forms of a gene are called alleles. Two alleles that are copies of the same allele in a common ancestor are called identical by descent, abbreviated as IBD. The pair of alleles in an individual constitutes that individual’s genotype. The observable expression of a particular genotype is called a phenotype.
Sperm and egg are created in a process called meiosis, which splits the chromosome pairs in half and creates cells with only twenty-three single chromosomes. When an embryo is formed from an egg and a sperm cell, it again has a full set of twenty-three pairs, with one chromosome of each pair coming from the mother and the other from the father. In meiosis, homologous chromosomes pair up, and they may exchange genetic material during a process called crossover. A chromosome in a gamete, which is a mixture of the two homologous chromosomes in the parent, can be modeled in the following way. It starts on either homologous chromosome at random, moves a random distance along this chromosome, and then switches to the other chromosome. It moves another random distance and switches again. This process continues until the end of the chromosome is reached.
There are two kinds of distance metric for chromosomes. Physical distances are measured as the number of base pairs (abbreviated as bp) between two points. The units for physical distances are bp and kb (1000 bp). Genetic distances are defined as the expected number of crossovers between two points, with the Morgan as the unit. Another common unit for genetic distances is the centimorgan (cM), one hundredth of a Morgan. Different models of the underlying crossover process give different genetic distances. The most popular one is the Haldane model, which says that the random distance until the next crossover is an exponential random variable. This implies that crossovers along the chromosome form a Poisson process. The genetic length of the human genome is about 35 Morgans. See Ott (1991).
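The Haldane model can be simulated directly: draw exponential gaps between crossovers and count how many land between two loci. The sketch below assumes distance is measured in Morgans, so the crossover rate is 1 per unit distance, and it treats recombination as an odd crossover count. Function names, the seed, and the sample size are illustrative choices.

```python
import math
import random


def count_crossovers(length_morgans: float, rng: random.Random) -> int:
    """Count crossovers on a segment when gaps between crossovers are Exponential(1)."""
    count = 0
    position = rng.expovariate(1.0)
    while position < length_morgans:
        count += 1
        position += rng.expovariate(1.0)
    return count


def simulate_recombination_fraction(d: float, n: int = 100_000, seed: int = 0) -> float:
    """Fraction of simulated gametes with an odd number of crossovers within distance d."""
    rng = random.Random(seed)
    odd = sum(count_crossovers(d, rng) % 2 for _ in range(n))
    return odd / n


d = 0.3
print("simulated:", simulate_recombination_fraction(d))
print("Haldane:  ", 0.5 * (1.0 - math.exp(-2.0 * d)))
```

The simulated fraction should agree with the closed-form value to within sampling error, which is roughly 0.001 at this sample size.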
If two alleles on the same parental chromosome are passed to the offspring together, one says that there is no recombination between them; otherwise, one says that there is recombination. Another way to explain recombination is that there is an odd number of crossovers between two genes. When two genes are inherited independently of each other, the probabilities for recombination and no recombination are equal, i.e., ½. Two genes are linked if the recombination frequency between them is smaller than ½. (Notice that the recombination frequency is never greater than ½.) A mapping function is a mapping between the recombination frequency and genetic distance for two loci. For example, under the Haldane model, the mapping function for two loci at genetic distance $d$ Morgans is $Q(d) = (1 - e^{-2d})/2$.
Hardy-Weinberg Equilibrium • The genotype frequencies reach a steady state across generations. • Assumptions: 1) Infinite population size 2) Discrete generations 3) Random mating 4) No selection 5) No migration 6) No mutation 7) Equal initial genotype frequencies in the two sexes.
Hardy-Weinberg Equilibrium • Consider a single locus with two alleles (A, a); the possible genotypes are (AA, Aa, aa). • Question: how do the genotype frequencies propagate through the generations? • Genotype frequencies at generation 0: AA: $U_0$, Aa: $2V_0$, aa: $W_0$ • $P_0 = P(A) = U_0 + V_0$ • $Q_0 = P(a) = W_0 + V_0 = 1 - P_0$
H. W. Equilibrium
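A minimal sketch of how genotype frequencies propagate under the Hardy-Weinberg assumptions. Here the three arguments are the full genotype frequencies of AA, Aa, and aa, and p is the frequency of allele A; the function name is illustrative.

```python
def next_generation(f_AA: float, f_Aa: float, f_aa: float) -> tuple:
    """Genotype frequencies after one round of random mating."""
    p = f_AA + f_Aa / 2.0   # frequency of allele A
    q = 1.0 - p             # frequency of allele a
    return p * p, 2.0 * p * q, q * q


gen0 = (0.5, 0.2, 0.3)      # arbitrary starting frequencies, not in equilibrium
gen1 = next_generation(*gen0)
gen2 = next_generation(*gen1)
print(gen1)  # approximately (0.36, 0.48, 0.16)
print(gen2)  # unchanged up to rounding: equilibrium is reached after one generation
```

The allele frequency p stays at 0.6 throughout, and the genotype frequencies settle at p^2, 2pq, q^2 after a single generation of random mating.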
HW Equilibrium for X-linked loci • Assume at generation n – gene frequency for female – gene frequency for male =>
HW Equilibrium for X-linked loci • Proof : Under the similar conditions, we have =>
HW Equilibrium for X-linked loci • a=2 • a = -1
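The X-linked case can be explored numerically. The notation below is an assumption, since the slide's symbols did not survive: q_female and q_male are the frequencies of an allele in females and males at generation n. Daughters receive one X from each parent and sons receive their X from the mother. Under this sketch, the combination 2 * q_female + q_male is conserved and the difference q_male - q_female is multiplied by -1/2 each generation, which appears to correspond to the values a = 2 and a = -1 on the slide.

```python
def x_linked_step(q_female: float, q_male: float) -> tuple:
    """One generation of allele-frequency change at an X-linked locus."""
    new_female = (q_female + q_male) / 2.0  # daughters: one X from each parent
    new_male = q_female                     # sons: X comes from the mother
    return new_female, new_male


def iterate(q_female: float, q_male: float, generations: int) -> tuple:
    for _ in range(generations):
        q_female, q_male = x_linked_step(q_female, q_male)
    return q_female, q_male


qf, qm = 0.2, 0.8
for n in range(6):
    print(n, round(qf, 4), round(qm, 4), round(2 * qf + qm, 4))
    qf, qm = x_linked_step(qf, qm)

print(iterate(0.2, 0.8, 50))  # both approach (2 * 0.2 + 0.8) / 3 = 0.4
```

The frequencies in the two sexes oscillate around the common limit and converge to it, so equilibrium is approached gradually instead of in one generation.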
Linkage Equilibrium alleles frequency : haplotype frequency of in the generation : recombination frequency(= ),
Linkage Equilibrium if
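A sketch of convergence to linkage equilibrium, using assumed notation because the slide's symbols are missing: h is the frequency of haplotype AB, p_a and p_b are the allele frequencies of A and B, and theta is the recombination frequency between the two loci. Under random mating, the standard recursion is h_next = (1 - theta) * h + theta * p_a * p_b, so the deviation from p_a * p_b shrinks by a factor (1 - theta) each generation.

```python
def step(h: float, p_a: float, p_b: float, theta: float) -> float:
    """Haplotype frequency of AB after one generation of random mating."""
    return (1.0 - theta) * h + theta * p_a * p_b


def haplotype_frequency(n: int, h0: float, p_a: float, p_b: float, theta: float) -> float:
    """Closed form after n generations: p_a*p_b + (1 - theta)**n * (h0 - p_a*p_b)."""
    return p_a * p_b + (1.0 - theta) ** n * (h0 - p_a * p_b)


h, p_a, p_b, theta = 0.5, 0.6, 0.5, 0.1   # illustrative values only
for n in range(0, 41, 10):
    print(n, round(haplotype_frequency(n, h, p_a, p_b, theta), 4))
```

With unlinked loci (theta = 1/2) the deviation halves every generation; with tight linkage it decays slowly.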
Selection: reproduction capacity • E.g. let $w_{AA}$, $w_{Aa}$, $w_{aa}$ (fitness) be the expected genetic contributions to the next generation for the given genotypes. • W.L.O.G. let $w_{Aa} = 1$, $w_{AA} = 1 - r$, $w_{aa} = 1 - s$, where $r \le 1$, $s \le 1$.
Selection • Let $p_n$ be the allele frequency of A at generation n, and $q_n = 1 - p_n$ the frequency for allele a.
Selection
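Selection • To reach an equilibrium state, the frequency $p_n$ of allele A must satisfy $p_n = 1$ or 0 or $s/(r+s)$ • Assume r, s have different signs • if r > 0, s ≤ 0, then $p_n \to 0$ => extinction of A • if r ≤ 0, s > 0, then $p_n \to 1$ => extinction of a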
Selection • if r, s have the same sign
Selection • if r < 0, s < 0, unstable equilibrium
Selection • If r > 0, s > 0, stable equilibrium
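The selection cases can be checked numerically. The sketch below assumes the parametrization used in the Lange reference: fitness 1 for Aa, 1 - r for AA, and 1 - s for aa, with p the frequency of allele A. The function names and the numerical values of r and s are illustrative.

```python
def next_freq(p: float, r: float, s: float) -> float:
    """Frequency of allele A after one generation of selection.

    Assumed fitnesses: AA = 1 - r, Aa = 1, aa = 1 - s.
    """
    q = 1.0 - p
    mean_fitness = 1.0 - r * p * p - s * q * q
    return p * (1.0 - r * p) / mean_fitness


def run(p: float, r: float, s: float, generations: int = 1000) -> float:
    for _ in range(generations):
        p = next_freq(p, r, s)
    return p


# Heterozygote advantage (r, s > 0): stable interior equilibrium at s / (r + s).
print(run(0.10, 0.2, 0.6), 0.6 / (0.2 + 0.6))
# Heterozygote disadvantage (r, s < 0): the interior equilibrium is unstable.
print(run(0.74, -0.2, -0.6), run(0.76, -0.2, -0.6))
# Different signs (r > 0, s < 0): allele A goes extinct.
print(run(0.90, 0.2, -0.1))
```

Starting on either side of 0.75 in the second case sends the population to opposite fixation states, which is what makes that equilibrium unstable.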
Heterozygote advantage (r, s both positive) • Geneticists have suggested that several recessive diseases are maintained at high frequency by the mechanism of heterozygote advantage. • The best evidence favoring this hypothesis exists for sickle cell anemia. A single dose of the sickle cell gene appears to confer protection against malaria.
Sickle Cell Anemia • Normal hemoglobin (Hb): 2 alpha and 2 beta chains form a 4-chain tetramer
Sickle Cell Anemia • Beta chains bind with other beta chains in red blood cells (RBC) when deoxygenated – polymerization occurs – Hb polymers distort RBC into sickled shapes – vaso-occlusion