Class 1: Introduction
The Tree of Life (Source: Alberts et al.)
The Cell
Example: Tissues in Stomach
DNA Components
Four nucleotide types:
- Adenine (A)
- Guanine (G)
- Cytosine (C)
- Thymine (T)
Hydrogen bonds:
- A-T
- C-G
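
The pairing rules above can be sketched in a few lines of Python. This is only an illustration: the function names `complement` and `reverse_complement` are made up for this example and do not come from the slides.

```python
# Watson-Crick pairing as listed on the slide: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the complement read in the opposite direction.

    The two strands of the double helix run antiparallel, so the partner
    strand is usually written as the reversed complement.
    """
    return complement(strand)[::-1]


if __name__ == "__main__":
    print(complement("ATGC"))          # TACG
    print(reverse_complement("ATGC"))  # GCAT
```

Applying `reverse_complement` twice returns the original strand, which is a quick sanity check on the pairing table.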
The Double Helix (Source: Alberts et al.)
DNA Duplication (Source: Mathews & van Holde)
DNA Organization (Source: Alberts et al.)
Genome Sizes
- E. coli (bacteria): 4.6 x 10^6 bases
- Yeast (simple fungi): 15 x 10^6 bases
- Smallest human chromosome: 50 x 10^6 bases
- Entire human genome: 3 x 10^9 bases
Genes
The DNA strings include:
- Coding regions ("genes")
  - E. coli has ~4,000 genes
  - Yeast has ~6,000 genes
  - C. elegans has ~13,000 genes
  - Humans have ~32,000 genes
- Control regions
  - These are typically adjacent to the genes
  - They determine when a gene should be expressed
- "Junk" DNA (unknown function)
Transcription (Source: Mathews & van Holde)
- Coding sequences can be transcribed to RNA
- RNA nucleotides:
  - Similar to DNA, with a slightly different backbone
  - Uracil (U) instead of Thymine (T)
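
At the level of sequence strings, the U-for-T substitution can be sketched as below. The sketch assumes the input is the coding (sense) strand, so the RNA reads the same as the DNA with U in place of T. The function name `transcribe` is illustrative only.

```python
def transcribe(coding_strand: str) -> str:
    """Return the RNA sequence corresponding to a DNA coding strand.

    Assumption: the input is the coding (sense) strand, so the transcript
    has the same sequence with uracil (U) replacing thymine (T).
    """
    return coding_strand.upper().replace("T", "U")


if __name__ == "__main__":
    print(transcribe("ATGGCT"))  # AUGGCU
```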
RNA Editing
RNA Editing (Source: Mathews & van Holde)
RNA roles
- Messenger RNA (mRNA)
  - Encodes protein sequences
- Transfer RNA (tRNA)
  - Adaptor between mRNA molecules and amino acids (protein building blocks)
- Ribosomal RNA (rRNA)
  - Part of the ribosome, a machine for translating mRNA to proteins
- ...
Transfer RNA
Anticodon:
- Matches a codon (triplet of mRNA nucleotides)
Attachment site:
- Matches a specific amino acid
Translation
- Translation is mediated by the ribosome
- The ribosome is a complex of protein and rRNA molecules
- The ribosome attaches to the mRNA at a translation initiation site
- The ribosome then moves along the mRNA sequence and, in the process, constructs a polypeptide
- When the ribosome encounters a stop signal, it releases the mRNA. The constructed polypeptide is released and folds into a protein.
Translation (Source: Alberts et al.)
Genetic Code
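
As a rough sketch, the standard genetic code can be written as a lookup table from mRNA codons to one-letter amino-acid symbols, with `*` marking a stop codon. The `translate` function below is a simplification and is not a model of the ribosome: it assumes translation starts at the first `AUG` in the string and reads triplets until a stop codon or the end of the sequence. All names are illustrative.

```python
from itertools import product

# Standard genetic code, listed with each codon position cycling through
# U, C, A, G (first position varies slowest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA string into a one-letter polypeptide string.

    Simplifying assumption: the initiation site is the first AUG found.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no initiation site found
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop signal: release the polypeptide
            break
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    print(translate("GGAUGGCUUGGUAACC"))  # MAW
```

In the usage line, the reading frame is AUG (M), GCU (A), UGG (W), UAA (stop), so the result is `MAW`.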
Protein Structure
- Proteins are polypeptides of 70-3000 amino acids
- This structure is (mostly) determined by the sequence of amino acids that make up the protein
Protein Structure
Evolution
- Related organisms have similar DNA
  - Similarity in sequences of proteins
  - Similarity in organization of genes along the chromosomes
- Evolution plays a major role in biology
  - Many mechanisms are shared across a wide range of organisms
  - During the course of evolution, existing components are adapted for new functions
Evolution of new organisms is driven by
- Diversity
  - Different individuals carry different variants of the same basic blueprint
- Mutations
  - The DNA sequence can be changed by single-base changes, deletion/insertion of DNA segments, etc.
- Selection bias
Course Goals
- Computational tools in molecular biology
- We will cover computational tasks that are posed by modern molecular biology
- We will discuss the biological motivation and setup for these tasks
- We will understand the kinds of solutions that exist and what principles justify them
Four Aspects
Biological
- What is the task?
Algorithmic
- How to perform the task at hand efficiently?
Learning
- How to adapt the parameters of the task from examples?
Statistics
- How to differentiate true phenomena from artifacts?
Example: Sequence Comparison
Biological
- Evolution preserves sequences, thus similar genes might have similar function
Algorithmic
- Consider all ways to "align" one sequence against another
Learning
- How do we define "similar" sequences? Use examples to define similarity
Statistics
- When we compare against ~10^6 sequences, what is a random match and what is a true one?
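
One common way to consider all alignments without enumerating them is dynamic programming. The sketch below computes a global alignment score in the style of Needleman-Wunsch. The slide does not name an algorithm, so this is one possible choice rather than the course's method. The scoring values (match +1, mismatch -1, gap -1) are arbitrary assumptions for illustration, and choosing them well is exactly the "Learning" question above.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Return the best global alignment score of sequences a and b.

    Uses dynamic programming over prefixes and keeps only one previous
    row, so memory is O(len(b)) and time is O(len(a) * len(b)).
    """
    # Row 0: aligning the empty prefix of a against prefixes of b (all gaps).
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # prefix of a against the empty prefix of b
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # a[i-1] against a gap
                curr[j - 1] + gap,           # b[j-1] against a gap
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "ACT"))  # 2
```

In the usage line, the best alignment is `ACGT` against `AC-T`: three matches and one gap, for a score of 2.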
Topics I
Dealing with DNA/protein sequences:
- Genome projects and how sequences are found
- Finding similar sequences
- Models of sequences: Hidden Markov Models
- Transcription regulation
- Protein families
- Gene finding
Topics II
Gene expression:
- Genome-wide expression patterns
- Data organization: clustering
- Reconstructing transcription regulation
- Recognizing and classifying cancers
Topics III
Models of genetic change:
- Long term: evolutionary changes among species
- Reconstructing evolutionary trees from current-day sequences
- Short term: genetic variations in a population
- Finding genes by linkage and association
Topics IV
Protein world:
- How proteins fold: secondary and tertiary structure
- How to predict protein folds from sequence data alone
- How to analyze protein changes from raw experimental measurements (mass spec)
- 2D gels
Class Structure
- 2 weekly meetings
  - Class: Mondays 16-18
  - Targil (exercise session): Tuesdays 18-20
Grade:
- 60% from five question sets
  - Each contains theoretical problems and practical computer questions
- 40% test
- 5% bonus for active participation
Exercises & Handouts
- Check regularly: http://www.cs.huji.ac.il/~cbio