DNA Properties
CSE, Marmara University
mimoza.marmara.edu.tr/~m.sakalli/cse546
Oct/19/09
• Computational Molecular Biology
• Bioinformatics
• Genomics
• Proteomics
• Functional genomics
• Structural bioinformatics

No simple definition of being alive (life)
Reproducing itself is a default mechanism of every living being. But how about computer programs, crystals, and self-building, self-learning robots and computers?
Life on Earth is the result of an evolutionary process; the idea is that all living things have a common ancestor and are related through…
Basic components of evolution:
• Inheritance
• Variation: defined legal moves in genotype space
• Selection: a probabilistic evaluation function
In computer science:
• DNA is a string of symbols from the alphabet {A, C, G, T}.
• Evolution is a search through a very large space of possible organism characteristics.
• The words built from this four-letter alphabet cover all the inherited characteristics (called the genotype) of all organisms.
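The computer-science view above can be made concrete with a minimal Python sketch. It assumes a genotype is simply a string over {A, C, G, T} and treats single-base substitutions as the "legal moves"; the function names are illustrative and do not come from the slides.

```python
ALPHABET = "ACGT"  # the four-letter DNA alphabet


def is_dna(seq):
    """Return True if seq is a non-empty string over {A, C, G, T}."""
    return len(seq) > 0 and all(base in ALPHABET for base in seq)


def genotype_space_size(length):
    """Number of distinct DNA strings of the given length (4 ** length)."""
    return len(ALPHABET) ** length


def point_mutation_neighbors(seq):
    """All strings reachable from seq by one substitution (one 'legal move')."""
    neighbors = []
    for i, old in enumerate(seq):
        for new in ALPHABET:
            if new != old:
                neighbors.append(seq[:i] + new + seq[i + 1:])
    return neighbors


if __name__ == "__main__":
    print(is_dna("GATTACA"))                         # valid DNA string
    print(genotype_space_size(10))                   # 4 ** 10 possible 10-mers
    print(len(point_mutation_neighbors("GATTACA")))  # 3 substitutions per position
```

Even for short strings the space grows as 4^n, which is why evolution is described here as a search rather than an enumeration.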

The Central Dogma in molecular biology
http://proquestcombo.safaribooksonline.com/0596002998/blast-CHP-2
Three processes: replication, transcription, and translation.
Almost every cell in our body has 23 pairs of chromosomes in the nucleus, and the genes on these chromosomes are responsible for almost all of our characteristics (not merely the physical ones).

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=mboc4.figgrp.600, by Bruce Alberts, Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts, and Peter Walter
Figure 4-5. The DNA double helix.
(A) A space-filling model of 1.5 turns of the DNA double helix. Each turn of DNA is made up of 10.4 nucleotide pairs, and the center-to-center distance between adjacent nucleotide pairs is 0.34 nm. The coiling of the two strands around each other creates two grooves in the double helix. As indicated in the figure, the wider groove is called the major groove, and the smaller the minor groove.
(B) A short section of the double helix viewed from its side, showing four base pairs. The nucleotides are linked together covalently by phosphodiester bonds through the 3′-hydroxyl (-OH) group of one sugar and the 5′-phosphate (P) of the next. Thus, each polynucleotide strand has a chemical polarity; that is, its two ends are chemically different. The 3′ end carries an unlinked -OH group attached to the 3′ position on the sugar ring; the 5′ end carries a free phosphate group attached to the 5′ position on the sugar ring.

DNA structure and base pairing
Polymer of:
• Deoxyribose sugar
• Phosphate
• Nitrogenous base
Bases: A, C, G, T (uracil, U, replaces T in RNA)
Pairing rule (R = purine, Y = pyrimidine):
• A (R) pairs with T (Y)
• G (R) pairs with C (Y)
Why double-stranded? It is chemically and biophysically more stable, and it allows some error correction (a backup copy) if one strand is accidentally damaged, for example by UV irradiation.
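The pairing rule and the "backup" idea can be sketched in a few lines of Python. This is only an illustration: the function names, the use of "N" to mark a damaged base, and the position-by-position alignment of the two strands are assumptions made for the example, not a model of real DNA repair.

```python
# Watson-Crick pairing: each purine (A, G) pairs with one pyrimidine (T, C).
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def complement(strand):
    """Base-by-base partner strand, written in the same left-to-right order."""
    return "".join(PAIR[base] for base in strand)


def reverse_complement(strand):
    """Partner strand read in its own 5'->3' direction."""
    return complement(strand)[::-1]


def repair(damaged, partner):
    """Restore bases marked 'N' (hypothetical damage marker) from the partner.

    `partner` is assumed to be aligned position by position with `damaged`,
    i.e. partner[i] is the base paired with damaged[i].
    """
    return "".join(PAIR[p] if d == "N" else d for d, p in zip(damaged, partner))


if __name__ == "__main__":
    strand = "GAATTC"
    partner = complement(strand)
    print(partner)
    print(reverse_complement(strand))
    print(repair("GANTTC", partner))
```

Because every pair joins one purine to one pyrimidine, the intact strand fully determines the damaged one, which is the redundancy the slide refers to.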

RNA - Translation
Genes (less than 5% of all DNA) provide the coding information: instructions for protein synthesis and regulatory functions.
Redundancy translates to robustness:
• Synonymous codons
• Dual strands
• Diploid
In translation, the information now encoded in RNA is deciphered (translated) into instructions for making a protein.
Codon: a set of three nucleotides. Each codon determines which amino acid is added next to the protein chain. For example, GCU, the first codon in the figure, codes for alanine.

The table of nucleotide triplets (codons) and their corresponding amino acids (aa)
In RNA, a uracil (U) is substituted for a thymine (T). This is a universal process.
The RNA alphabet is A, C, G, and U, e.g. GAAUUC.
• The third position of a codon is often insignificant.
• ATG (AUG in RNA): start codon of a protein (methionine).
• T (U in RNA) in the middle position: hydrophobic aa.
• 64 possible codons but only 20 amino acids in total, plus start and stop signals or regulatory functions.
[Codon table figure; axes: 1st, 2nd, and 3rd nt position, each U, C, A, G]
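As a hedged illustration of the codon table, the sketch below builds the standard genetic code from a compact 64-letter string (bases ordered U, C, A, G; "*" marks a stop codon) and checks the claims on the slide. The variable and function names are choices made for this example.

```python
from collections import Counter

BASES = "UCAG"
# Standard genetic code, one letter per codon, in UCAG order for each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(mrna):
    """Translate an mRNA string codon by codon, stopping at the first stop."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# How many codons encode each amino acid (or stop)?
degeneracy = Counter(CODON_TABLE.values())

# Amino acids whose codons have U in the middle position.
middle_u = {aa for codon, aa in CODON_TABLE.items() if codon[1] == "U"}

# First-two-base "boxes" where the third base does not matter at all.
fourfold_boxes = [
    b1 + b2
    for b1 in BASES
    for b2 in BASES
    if len({CODON_TABLE[b1 + b2 + b3] for b3 in BASES}) == 1
]

if __name__ == "__main__":
    print(len(CODON_TABLE), "codons,", len(degeneracy) - 1, "amino acids")
    print("GCU ->", CODON_TABLE["GCU"], "(A = alanine)")
    print("middle U:", sorted(middle_u))
    print("third base irrelevant for:", fourfold_boxes)
```

Here `middle_u` contains only F, L, I, M, and V (all hydrophobic), and in eight of the sixteen boxes the third base has no effect on the amino acid.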

SNP (single nucleotide polymorphism), wobbling in the code, neutral synonymous mutations
Some changes at every third position of the sequence, such as the point mutation shown below, do not change the amino acid sequence or the protein produced. For example, alanine is produced by both GCU and GCC, so a point mutation from U to C at that position makes no difference.
GCUAGGAUCUCAGGCUCA
Point mutation
GCCAGGAUCUCAGGCUCA
Protein-coding sequences are called exons. The remaining parts are introns, intervening DNA segments. Both introns and exons are transcribed into RNA (see next slide), but only exons remain in the final mRNA transcript.
Frameshift of the sequence: six possible reading frames. It is therefore important to know at which codon to start translation and where to stop.
http://en.wikipedia.org/wiki/Gene
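The point mutation and the six reading frames can be checked with a short Python sketch. It uses the two sequences from the slide and the standard genetic code; treating the RNA string's reverse complement as the second strand, and the function names, are assumptions made for illustration.

```python
BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
RNA_PAIR = str.maketrans("ACGU", "UGCA")


def translate(rna):
    """Translate every complete codon; stop codons are shown as '*'."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3))


def is_synonymous(original, mutated):
    """True if two equal-length sequences encode the same protein."""
    return len(original) == len(mutated) and translate(original) == translate(mutated)


def reading_frames(rna):
    """Three offsets on the given strand plus three on its reverse complement."""
    reverse_comp = rna.translate(RNA_PAIR)[::-1]
    return [strand[offset:] for strand in (rna, reverse_comp) for offset in range(3)]


if __name__ == "__main__":
    original = "GCUAGGAUCUCAGGCUCA"
    mutated = "GCCAGGAUCUCAGGCUCA"  # U -> C at the third position of codon 1
    print(translate(original), translate(mutated), is_synonymous(original, mutated))
    for frame in reading_frames(original):
        print(frame, translate(frame))
```

Each of the six frames yields a different amino acid string, which is why the start position has to be known before translating.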

A protein-coding region framed by a start codon (ATG, Met) and any stop codon (TAA, TAG, or TGA) is called an open reading frame (ORF).
An example of an ORF, with splicing to eliminate introns:
….TCGAATGGCATTCGCAGTC…………..T ACTTGCACGCTTGACCGTCATAAGCA….
In addition, each of the 20 amino acids has different chemical properties, which cause the protein chains to fold into different 3D shapes and differentiate their particular functions in the cell. For example, certain folding patterns (called tertiary structures) make it possible for specific enzymes to bind in a particular place. One change in the DNA sequence could change an amino acid, which could change the protein structure, and in turn the enzymes that interact with it.
A Science Primer: http://www.ncbi.nlm.nih.gov/About/primer/est.html
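The ORF definition above translates directly into a simple scanner. The following Python sketch is a minimal version under stated assumptions: it searches only the given strand (not the reverse complement), ignores introns and splicing, and the test sequence is a made-up toy string, not the example from the slide.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna):
    """Return (start, end, sequence) for each ATG...stop stretch.

    Each of the three forward frames is scanned codon by codon. An ORF opens
    at the first ATG seen in a frame and closes at the next in-frame stop
    codon; `end` is exclusive and includes the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                orfs.append((start, i + 3, dna[start:i + 3]))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    toy = "CCATGGCATTCTAAGG"  # hypothetical sequence: CC ATG GCA TTC TAA GG
    for start, end, seq in find_orfs(toy):
        print(start, end, seq)
```

A fuller gene finder would also scan the reverse complement and apply a minimum-length threshold, since short ATG...stop stretches occur by chance.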

Levels and types of genome variations
Plant genomes may differ from one another in different ways: http://www.igd.cornell.edu/Comparative%20Genomics
1. Amount of DNA in the nucleus. Quantified in picograms (also called the C-value); varies over 1000-fold.
2. Number and size of chromosomes.
3. Differences at the sequence level, both in the absolute order of the bases and in the type and number of different classes of sequences.
Organisms that originated millions of years ago from the same sequence should share the same sequential structures (family tree, phylogeny).
Some of the mechanisms of genetic variation:
§ Point mutations
§ Insertions and deletions
§ Translocations
§ Transposons ((mobile) jumping genes) and retrotransposons, which copy themselves from RNA back to DNA using reverse transcriptase
§ Splicing, transcription, and translation errors

DNA: contains non-genic material
RNA: unstable
cDNA: stable and mainly genes
Finding genes: cDNA
The genetic sequence could be analyzed from the DNA, but DNA contains too much non-genic junk material. An alternative is to study just the mRNA; however, mRNA and protein are very unstable and therefore difficult to work with. Instead, scientists use special enzymes to convert RNA into complementary DNA (cDNA), which is a much more stable compound. Because it is generated from an mRNA in which the introns have been removed, cDNA represents only transcribed DNA sequence, the genes.
Genetic mapping: used for linkage mapping; it uses the concepts of Mendelian inheritance and recombination frequencies to determine the chromosomal location of markers by analyzing their inheritance patterns. It is done by either Southern blot (electrophoresis-separated fragments subsequently detected by probe hybridization) or, more recently, polymerase chain reaction (PCR, using thermal cycling) based methods.
Example: a tomato F2 population used to calculate recombination frequencies, and genetic distances, between a selection of SSRs (simple sequence repeats, microsatellites) and other molecular markers.
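How a recombination frequency becomes a genetic distance can be sketched as follows. This is a simplified illustration: it assumes recombinant offspring can be counted directly (as in a testcross; an F2 population like the tomato example usually needs a likelihood-based estimate instead), it uses Haldane's mapping function as one common choice, and the counts are hypothetical.

```python
import math


def recombination_frequency(recombinants, total):
    """Fraction of offspring that are recombinant for a pair of markers."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    return recombinants / total


def haldane_cm(r):
    """Map distance in centimorgans from Haldane's function d = -50 ln(1 - 2r).

    Valid for 0 <= r < 0.5; r = 0.5 corresponds to unlinked markers.
    """
    if not 0 <= r < 0.5:
        raise ValueError("recombination frequency must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


if __name__ == "__main__":
    # Hypothetical counts: 18 recombinants among 200 scored offspring.
    r = recombination_frequency(18, 200)
    print(f"r = {r:.3f}")
    print(f"naive distance   = {100 * r:.2f} cM")
    print(f"Haldane distance = {haldane_cm(r):.2f} cM")
```

For small r the two distances nearly coincide; the mapping function matters for more distant markers, where double crossovers hide some recombination events.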

Comparative mapping: Among related but sexually incompatible species, heterologous (between-species) DNA markers can be used to generate comparative maps and to infer linkage conservation and the position of orthologous loci (homologous loci that diverged by speciation). This requires a minimal amount of similarity between the target and probe species and so cannot be used with more distantly related species.
Most Gramineae genomes (i.e. grass species such as maize, rice, wheat, barley, and millet) are connected through comparative genetic maps. While genome size varies dramatically among grass species, gene content and gene order remain more highly conserved.

Packing of DNA in the nucleus
http://employees.csbsju.edu/hjakubowski/classes/ch331/DNA/oldnastructure.html

• Archaebacterium living in a superheated sulphur vent at the bottom of the ocean
• A two-ton polar bear roaming the Arctic Circle
Genome size (length of DNA) varies from 5,000 (SV40 virus) to 3×10^9 (humans) and 10^11 (higher plants) nucleotide pairs.
All organisms share basic properties:
• Made of cells (membrane-enclosed sacks of chemicals)
• Carry out basic reactions (e.g. core metabolic and developmental pathways)
Figure 1-38. Genome sizes compared. Genome size is measured in nucleotide pairs of DNA per haploid genome, that is, per single copy of the genome. (The cells of sexually reproducing organisms such as ourselves are generally diploid: they contain two copies of the genome, one inherited from the mother, the other from the father.) Closely related organisms can vary widely in the quantity of DNA in their genomes, even though they contain similar numbers of functionally distinct genes. (Data from W.-H. Li, Molecular Evolution, pp. 380-383. Sunderland, MA: Sinauer, 1997.)

Tree of Life
Three major groups:
• Archaea (recently discovered)
• Bacteria (germs, blue-green algae (cyanobacteria), symbiotic organisms)
• Eukaryotes: animals, green plants, fungi, protists
Viruses
Figure 1-21. The three major divisions (domains) of the living world. Note that traditionally the word bacteria has been used to refer to procaryotes in general, but more recently has been redefined to refer to eubacteria specifically. Where there might be ambiguity, we use the term eubacteria when the narrow meaning is intended. The tree is based on comparisons of the nucleotide sequence of a ribosomal RNA subunit in the different species. The lengths of the lines represent the numbers of evolutionary changes that have occurred in this molecule in each lineage.