DNA/Protein structure-function analysis and prediction Lecture 11: DNA/RNA structure

Central Dogma of Molecular Biology
Diagram: DNA (replication) → transcription → mRNA → translation → protein
• Transcription is carried out by RNA polymerase (II)
• Translation is performed on ribosomes
• Replication is carried out by DNA polymerase
• Reverse transcriptase copies RNA into DNA
• Transcription + Translation = Expression
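The two expression steps can be illustrated with a minimal Python sketch. It assumes the standard genetic code and treats the input as the coding (sense) strand, so transcription reduces to replacing T with U. The function names and the toy sequence are hypothetical and not part of the lecture.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def transcribe(coding_dna):
    """Return the mRNA matching the coding strand (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate an mRNA from its first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


toy_gene = "ATGGCCTAA"  # made-up example: Met, Ala, stop
mrna = transcribe(toy_gene)
print(mrna, translate(mrna))
```

Real transcription reads the template strand and real translation starts at a start codon; both details are deliberately left out here.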

But DNA can also be transcribed into non-coding RNA …
• tRNA (transfer): transfers amino acids to the ribosome during protein synthesis.
• rRNA (ribosomal): essential component of the ribosomes (in complex with ribosomal proteins).
• snRNA (small nuclear): mainly involved in RNA splicing (removal of introns); acts in snRNPs.
• snoRNA (small nucleolar): involved in chemical modifications of ribosomal RNAs and other RNA genes; acts in snoRNPs.
• SRP RNA (signal recognition particle): forms an RNA-protein complex involved in protein secretion.
• Further: microRNA, eRNA, gRNA, tmRNA etc.

RNA editing

Eukaryotes have spliced genes …
• Promoter: involved in transcription initiation (TF/RNApol-binding sites)
• TSS: transcription start site
• UTRs: un-translated regions (important for translational control)
• Exons will be spliced together by removal of the introns
• Poly-adenylation site: important for transcription termination (but also: mRNA stability, export of mRNA from the nucleus etc.)
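Splicing can be pictured as joining exon intervals and discarding everything in between. The sketch below assumes exon coordinates are already known and given as 0-based, end-exclusive intervals. The `splice` helper and the toy pre-mRNA are hypothetical.

```python
def splice(pre_mrna, exons):
    """Join exon intervals (0-based, end-exclusive) in order, dropping introns."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Toy pre-mRNA: exon 1, one intron, exon 2 (all sequences are made up).
exon1, intron, exon2 = "AUGGCC", "GUAAGUUUUCAG", "GAAUAA"
pre_mrna = exon1 + intron + exon2
exons = [(0, 6), (18, 24)]

print(splice(pre_mrna, exons))
```

The cell finds splice sites from sequence signals in the pre-mRNA; this sketch skips that step and only shows the joining.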

DNA makes mRNA makes Protein

Some facts about human genes
• There are about 20,000-25,000 genes in the human genome (~3% of the genome)
• Average gene length is ~8,000 bp
• Average of 5-6 exons per gene
• Average exon length is ~200 bp
• Average intron length is ~2,000 bp
• 8% of the genes have a single exon
• Some exons can be as small as 1 or 3 bp

DMD: the largest known human gene
• The largest known human gene is DMD, the gene that encodes dystrophin: ~2.4 million bp over 79 exons
• X-linked recessive disease (affects boys)
• Two variants: Duchenne-type (DMD) and Becker-type (BMD)
• Duchenne-type: more severe, frameshift mutations
• Becker-type: milder phenotype, “in-frame” mutations
Figure: Posture changes during progression of Duchenne muscular dystrophy
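The difference between frameshift and in-frame mutations comes down to whether the number of deleted bases is a multiple of three. The following sketch shows this on a made-up sequence; the helper names are hypothetical and nothing here is taken from the real DMD gene.

```python
def codons(seq):
    """Split a sequence into complete codons, ignoring leftover bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def delete(seq, start, length):
    """Remove `length` bases beginning at 0-based position `start`."""
    return seq[:start] + seq[start + length:]


def deletion_type(length):
    """A deletion keeps the reading frame only if its length is a multiple of 3."""
    return "in-frame" if length % 3 == 0 else "frameshift"


toy = "ATGGCTGAAAAGCTTTGG"  # made-up sequence, 6 codons

print(codons(toy))
print(deletion_type(3), codons(delete(toy, 3, 3)))  # downstream codons preserved
print(deletion_type(1), codons(delete(toy, 3, 1)))  # every downstream codon changes
```

After the 3-base deletion one codon is lost but the rest read as before. After the 1-base deletion all downstream codons differ, which fits the more severe phenotype the slide associates with frameshifts.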

Nucleic acid basics
• Nucleic acids are polymers
• Each monomer consists of 3 moieties (phosphate, sugar and base)
Figure labels: nucleotide, nucleoside

Nucleic acid basics (2)
• A base can be one of 5 ring structures (A, G, C, T or U)
• Purines and pyrimidines can base-pair (Watson-Crick pairs)
Watson and Crick, 1953

Nucleic acids as hetero-polymers
• Nucleosides, nucleotides
• DNA and RNA strands
Figure labels: ribose sugar (RNA precursor); 2’-deoxyribose sugar (DNA precursor); 2’-deoxythymidine triphosphate (nucleotide)
REMEMBER:
• DNA = deoxyribonucleotides; RNA = ribonucleotides (OH-groups at the 2’ position)
• Note the directionality of DNA (5’-3’ & 3’-5’) or RNA (5’-3’)
• DNA = A, G, C, T; RNA = A, G, C, U
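Directionality matters in practice because sequences are written 5’ to 3’, so the partner strand of a duplex is the reverse complement, not just the complement. A minimal sketch, with hypothetical helper names:

```python
# Watson-Crick complements for the DNA alphabet.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(dna):
    """Complementary bases in the same left-to-right order (reads 3'->5')."""
    return dna.upper().translate(DNA_COMPLEMENT)


def reverse_complement(dna):
    """Partner strand written in the conventional 5'->3' direction."""
    return complement(dna)[::-1]


strand = "ATGC"  # 5'-ATGC-3'
print(complement(strand))          # 3'-TACG-5'
print(reverse_complement(strand))  # 5'-GCAT-3'
```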

So … DNA RNA

Stability of base-pairing
• C-G base pairing is more stable than A-T (A-U) base pairing (why?)
• The 3rd codon position has freedom to evolve (synonymous mutations)
• Species can therefore optimise their G-C content (e.g. thermophiles are GC rich) (consequences for codon use?)
Figure: Thermocrinis ruber, heat-loving bacteria

DNA compositional biases
• Base compositions of genomes: G+C (and therefore also A+T) content varies between different genomes
• The GC-content is sometimes used to classify organisms in taxonomy
• High G+C content bacteria: Actinobacteria, e.g. in Streptomyces coelicolor it is 72%
• Low G+C content: Plasmodium falciparum (~20%)
• Other examples:
  • Saccharomyces cerevisiae (yeast) 38%
  • Arabidopsis thaliana (plant) 36%
  • Escherichia coli (bacteria) 50%
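G+C content is a simple fraction and can be computed directly from a sequence. The sketch below also counts Watson-Crick hydrogen bonds, using three per G-C pair and two per A-T (A-U) pair, which is one reason G-C rich duplexes are more stable. The function names and test sequences are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C among the A, C, G, T/U bases of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    total = gc + at
    return gc / total if total else 0.0


def duplex_hydrogen_bonds(seq):
    """Hydrogen bonds if every base is Watson-Crick paired: G-C = 3, A-T/A-U = 2."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    return 3 * gc + 2 * at


print(gc_content("GGCCAATT"))
print(duplex_hydrogen_bonds("GGCC"), duplex_hydrogen_bonds("AATT"))
```

Hydrogen-bond counting is only a rough proxy for stability, since base stacking also contributes.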

Genetic diseases: cystic fibrosis
• Known since very early on (“Celtic gene”)
• Autosomal, recessive, hereditary disease (Chr. 7)
• Symptoms:
  • In exocrine glands (which produce sweat and mucus)
  • Abnormal secretions
  • Respiratory problems
  • Reduced fertility and (male) anatomical anomalies

cystic fibrosis (2)
• Gene product: CFTR (cystic fibrosis transmembrane conductance regulator)
• CFTR is an ABC (ATP-binding cassette) transporter or traffic ATPase.
• These proteins transport molecules such as sugars, peptides, inorganic phosphate, chloride, and metal cations across the cellular membrane.
• CFTR transports chloride ions (Cl-) across the membranes of cells in the lungs, liver, pancreas, digestive tract, reproductive tract, and skin.

cystic fibrosis (3)
• The CF gene CFTR has a 3-bp deletion leading to Del 508 (Phe) in the 1480 aa protein (epithelial Cl- channel)
• The protein is degraded in the ER instead of being inserted into the cell membrane
Figure: Diagram depicting the five domains of the CFTR membrane protein (Sheppard 1999). The ΔF508 deletion is the most common cause of cystic fibrosis. The isoleucine (Ile) at amino acid position 507 remains unchanged because both ATC and ATT code for isoleucine.
Figure: Theoretical model of NBD1. PDB identifier 1NBD as viewed in Protein Explorer, http://proteinexplorer.org
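The remark about Ile507 can be checked on a short stretch of codons. The sketch below assumes the commonly cited local sequence ATC ATC TTT GGT GTT for residues 506-510 and a deletion of the three bases CTT spanning codons 507 and 508. Both are assumptions for illustration and should be verified against a reference sequence. Only the five codons needed are tabulated.

```python
# Only the codons used in this example (standard genetic code).
CODONS = {"ATC": "Ile", "ATT": "Ile", "TTT": "Phe", "GGT": "Gly", "GTT": "Val"}


def translate(dna):
    """Translate complete codons using the small table above."""
    return [CODONS[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3)]


# Assumed wild-type stretch: Ile506 Ile507 Phe508 Gly509 Val510.
wild_type = "ATCATCTTTGGTGTT"

# Assumed deletion: bases 5..7 (0-based), i.e. the C of codon 507
# and the first two T's of codon 508.
mutant = wild_type[:5] + wild_type[8:]

print(translate(wild_type))
print(mutant, translate(mutant))
```

Under these assumptions codon 507 changes from ATC to ATT, still isoleucine, and the phenylalanine is lost without shifting the reading frame.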

Let’s return to DNA and RNA structure …
• Unlike the three-dimensional structures of proteins, DNA molecules assume simple double-helical structures independent of their sequences.
• Three kinds of double helices have been observed in DNA: type A, type B, and type Z, which differ in their geometries.
• RNA, on the other hand, can have structures as diverse as those of proteins, as well as a simple double helix of type A.
• The ability to be both informational and diverse in structure suggests that RNA was the prebiotic molecule that could function in both replication and catalysis (the RNA World Hypothesis).
• In fact, some viruses encode their genetic material in RNA (e.g. retroviruses)

Three-dimensional structures of double helices
Figure: Space-filling models of A-, B- and Z-DNA. Side view: A-DNA, B-DNA, Z-DNA. Top view: A-DNA, B-DNA, Z-DNA.

Major and minor grooves (1)

Major and minor grooves (2)
• The major groove is approximately 50% wider than the minor.
• Proteins that interact with DNA often make contact with the edges of the base pairs that protrude into the major groove.

Forces that stabilize the nucleic acid double helix
• There are two major forces that contribute to the stability of helix formation:
  • Hydrogen bonding in base-pairing
  • Hydrophobic interactions in base stacking
Figure labels: same-strand stacking; cross-strand stacking (strands drawn 5’-3’ and 3’-5’)

Types of DNA double helix
• Type A: major conformation of RNA, minor conformation of DNA; right-handed helix; short and broad
• Type B: major conformation of DNA; right-handed helix; long and thin
• Type Z: minor conformation of DNA; left-handed helix; longer and thinner

Right-handed B-DNA

Secondary structures of nucleic acids
• DNA is primarily in duplex form
• RNA is normally single-stranded and can adopt diverse secondary structures other than a duplex.

Non B-DNA secondary structures
• Cruciform DNA
• Slipped DNA
• Triple helical DNA (Hoogsteen base pairs)
Source: Van Dongen et al. (1999), Nature Structural Biology 6, 854-859

More secondary structures
• RNA pseudoknots
• Cloverleaf rRNA structure
Figure: 16S rRNA secondary structure based on phylogenetic data
Source: Cornelis W. A. Pleij in Gesteland, R. F. and Atkins, J. F. (1993) THE RNA WORLD. Cold Spring Harbor Laboratory Press.

3D structures of RNA: transfer-RNA structures
• Secondary structure of tRNA (cloverleaf)
• Tertiary structure of tRNA

3D structures of RNA: ribosomal-RNA structures
• Secondary structure of large rRNA (16S)
• Tertiary structure of large rRNA subunit
Ban et al., Science 289 (905-920), 2000

3D structures of RNA: catalytic RNA
• Secondary structure of self-splicing RNA
• Tertiary structure of self-splicing RNA

Some structural rules …
• Base-pairing is stabilizing
• Un-paired sections (loops) destabilize
• 3D conformation with interactions makes up for this
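One classical way to turn the first rule into an algorithm is Nussinov-style base-pair maximization, sketched below. This technique is brought in for illustration and is not something the lecture describes. It only counts Watson-Crick pairs (A-U, G-C), enforces a minimum hairpin loop size (the `min_loop` parameter, an assumption), and ignores loop penalties, stacking energies and pseudoknots.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def max_base_pairs(rna, min_loop=3):
    """Maximum number of nested Watson-Crick pairs (Nussinov-style DP).

    best[i][j] holds the best pair count for the subsequence rna[i..j].
    A pair (k, j) requires at least `min_loop` unpaired bases between k and j.
    """
    n = len(rna)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                if (rna[k], rna[j]) in WATSON_CRICK:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAACCC"))  # hairpin: three G-C pairs around an AAA loop
```

Practical structure prediction uses energy models that also implement the second rule (loops destabilize); this sketch only maximizes the number of pairs.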

Final notes
• Sense/anti-sense RNA: antisense RNA blocks translation through hybridization with the coding strand (the mRNA).
  Example: Tomatoes synthesize ethylene in order to ripen. Transgenic tomatoes have been constructed that carry in their genome an artificial gene (DNA) that is transcribed into an antisense RNA complementary to the mRNA for an enzyme involved in ethylene production; these tomatoes make only 10% of the normal enzyme amount.
• Sense/anti-sense peptides: have been used therapeutically, especially in cancer and anti-viral therapy.
• Sense/anti-sense proteins: does it make (anti)sense? Codons for hydrophilic and hydrophobic amino acids on the sense strand may sometimes be complemented, in frame, by codons for hydrophobic and hydrophilic amino acids on the antisense strand. Furthermore, antisense proteins may sometimes interact with high specificity with the corresponding sense proteins… BUT VERY RARE: HIGHLY CONSERVED CODON BIAS
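The sense/antisense codon relationship can be explored with a short sketch that pairs each codon with the codon read in frame on the opposite strand. It assumes the standard genetic code; the helper names are hypothetical, and whether an amino acid counts as hydrophobic or hydrophilic is left to the reader's preferred scale.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_codon(codon):
    """Codon read 5'->3' on the opposite strand, in the same frame."""
    return codon.translate(COMPLEMENT)[::-1]


def sense_antisense_pairs():
    """Map each sense codon to (sense amino acid, antisense amino acid)."""
    return {
        codon: (aa, CODON_TABLE[antisense_codon(codon)])
        for codon, aa in CODON_TABLE.items()
    }


pairs = sense_antisense_pairs()
print("TTC ->", antisense_codon("TTC"), pairs["TTC"])  # Phe opposite Glu
print("ATC ->", antisense_codon("ATC"), pairs["ATC"])  # Ile opposite Asp
```

The middle base of a codon is complemented by the middle base of its antisense codon, so a sense codon with a central T always faces an antisense codon with a central A. That is one way to read the hydrophobic/hydrophilic pattern the slide describes.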