general: Activators protein-DNA interaction
MBV 4230 The sequence-specific activators: transcription factors
- Modular design with a minimum of two functional domains
  1. DBD - DNA-binding domain
  2. TAD - transactivation domain
- DBD: several structural motifs, used to classify TFs into families
- TAD: a few different types
  - Three classical categories
    - Acidic domains (Gal4p, steroid receptor)
    - Glutamine-rich domains (Sp1)
    - Proline-rich domains (CTF/NF1)
  - Mutational analyses: bulky hydrophobic residues more important than acidic ones
  - Unstructured in the free state; 3D structure in contact with target?
- Most TFs are more complex
  - Regulatory domains, ligand-binding domains etc.
[Figure: schematic TF from N- to C-terminus with DBD and TAD]
TF classification based on structure of DBD
- Two levels of recognition
  1. Shape recognition: an α-helix fits into the major groove of B-DNA. This is used in most interactions.
  2. Chemical recognition: the negatively charged sugar-phosphate chain is involved in electrostatic interactions; hydrogen bonding is crucial for sequence recognition.
[Figure panels: zinc finger; bHelix-Loop-Helix (Max); leucine zipper (Gcn4p); p53 DBD; NFκB; STAT dimer]
Alternative classification of TFs on the basis of their regulatory role
- Classification questions
  - Is the factor constitutively active, or does it require a signal for activation?
  - Does the factor, once synthesized, automatically enter the nucleus to act in transcription?
  - If the factor requires a signal to become active in transcriptional regulation, what is the nature of that signal?
- Classification system
  - I. Constitutively active nuclear factors
  - II. Regulatory transcription factors
    - Developmental TFs
    - Signal-dependent
      - Steroid receptors
      - Internal signals
      - Cell surface receptor controlled
        - Nuclear
        - Cytoplasmic
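The three classification questions can be read as a small decision procedure. The following is a minimal sketch only: the function name, the parameter names and the `Signal` labels are hypothetical and chosen for illustration, not taken from the lecture or from the cited paper.

```python
from enum import Enum
from typing import Optional


class Signal(Enum):
    """Nature of the activating signal (hypothetical labels for illustration)."""
    STEROID = "steroid receptor ligand"
    INTERNAL = "internal signal"
    CELL_SURFACE = "cell surface receptor"


def classify_tf(constitutive: bool,
                developmental: bool = False,
                signal: Optional[Signal] = None,
                resident_nuclear: Optional[bool] = None) -> str:
    """Walk the classification questions in order and return a class label."""
    if constitutive:
        return "I. Constitutively active nuclear factor"
    if developmental:
        return "II. Regulatory TF: developmental"
    if signal is Signal.STEROID:
        return "II. Regulatory TF: signal-dependent, steroid receptor"
    if signal is Signal.INTERNAL:
        return "II. Regulatory TF: signal-dependent, internal signal"
    if signal is Signal.CELL_SURFACE:
        # Last question: where does the factor wait before the signal arrives?
        where = "nuclear" if resident_nuclear else "cytoplasmic"
        return ("II. Regulatory TF: signal-dependent, "
                f"cell surface receptor controlled, {where}")
    raise ValueError("a non-constitutive, non-developmental factor needs a signal type")
```

Calling `classify_tf(False, signal=Signal.CELL_SURFACE, resident_nuclear=False)` returns the label for the cytoplasmic, cell surface receptor controlled branch of the scheme.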
Classification - regulatory function
Brivanlou and Darnell (2002) Science 295, 813 -
Sequence-specific DNA binding - essential for activators
- TFs create nucleation sites in promoters for activation complexes
- Sequence-specific DNA binding plays a crucial role
Principles of sequence specific DNA-binding
How is a sequence (cis-element) recognized from the outside?
- Shape recognition
  - Form/geometry
- Chemical recognition
  - Electrostatic interaction
  - Hydrogen bonds
  - Hydrophobic interaction
Complementary forms
- The dimensions of an α-helix fit the dimensions of the major groove in B-DNA
- Side chains point outwards and are ideally positioned to engage in hydrogen bonds
Direct reading of DNA sequence: recognition of form
- The dimensions of an α-helix fit the dimensions of the major groove in B-DNA
- Most common type of interaction
- Usually multiple domains participate in recognition
  - Dimers of the same motif
  - Tandemly repeated motifs
  - Interaction of two different motifs
- Recognition: detailed fit of complementary surfaces
  - Hydration: water participates
  - Sequence-specific variation of DNA structure
Example
- Steroid receptor
Next level: chemical recognition - reading of sequence information
- Negatively charged sugar-phosphate chain = basis for electrostatic interaction
  - Equal everywhere - no sequence recognition
  - Still a main contributor to the strength of binding
Recognition by hydrogen bonding
- Hydrogen bonding is a key element in sequence-specific recognition
  - 10-20 hydrogen bonds in the contact surface
  - Hydrogen-bonding capacity is not exhausted by base pairing in duplex DNA; free positions point outwards in the major groove
[Figure labels: D, A, A]
Docked protein side chains exploit the H-bonding possibilities for interaction
- Hydrogen bonding is essential for sequence-specific recognition
  - 10-20 hydrogen bonds in the contact interface
  - Most contacts in the major groove
  - Purines most important
- A Zif example
Interaction: protein side chain - DNA bp
- Close-up
  - Amino acid side chains point outwards from the α-helix and are optimally positioned for base interaction
Hydrophobic contact points
[Figure label: Ile]
Homeodomains
The homeodomain family: common DBD structure
- Homeotic genes - biology
  - Regulation of Drosophila development
  - Striking phenotypes of mutants: body parts move
  - Control the genetic developmental program
- Homeobox / homeodomain
  - Conserved DNA sequence, the “homeobox”, in a large number of genes
  - Encodes a 60 aa “homeodomain”
  - A stably folded structure that binds DNA
  - Similarity with the prokaryotic helix-turn-helix
- 3D structure determined for several HDs
  - Drosophila Antennapedia HD (NMR)
  - Drosophila Engrailed HD-DNA complex (crystal)
  - Yeast MATα2
Homeodomain family: common DBD structure
- Major groove contact via a three-α-helix structure
  - Helix 3 enters the major groove (“recognition helix”)
  - Helices 1 and 2 lie antiparallel across helix 3
  - 16 α-helical aa conserved
    - 9 in the hydrophobic core
    - some in the DNA-contact interface (common docking mechanism?)
Engrailed
Homeodomain family: common DBD structure
- Minor groove contacted via the N-terminal flexible arm
  - R3 and R5 in Engrailed, and R7 in MATα2, contact AT in the minor groove
  - R5 conserved in 97% of HDs
  - Deletions and mutations impair DNA binding
- Loop between helix 1 and 2 determines Ubx versus Antp function
  - Close to DNA
  - Exposed for protein-protein interaction
HD paradox: what determines sequence specificity?
- Drosophila Ultrabithorax (Ubx), Antennapedia (Antp), Deformed (Dfd) and Sex combs reduced (Scr): closely similar HDs, very different biological roles
- Minor differences in DNA binding in vitro
  - TAAT motif bound by most HD factors
  - Contrast between promiscuity in vitro and specific effects in vivo
- Swaps reveal that surprisingly much of the specificity is determined by the N-terminal arm, which contacts the minor groove
  - Swaps: Antp with Scr-type N-terminal arm shows Scr-type specificity in vivo
  - Swaps: Dfd with Ubx-type N-terminal arm shows Ubx-type specificity in vivo
- N-terminal arm more divergent than the rest of the HD
  - R5 and R7 (contacting DNA) are present in all of Ubx, Antp, Dfd and Scr
  - Other tail aa diverge much more
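To get a feel for how little information a 4-bp core such as TAAT carries, one can scan a sequence for it on both strands. This is a minimal sketch; the function names and the toy sequence are invented for illustration and do not represent a real promoter.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def find_core_sites(seq: str, core: str = "TAAT") -> list[tuple[int, str]]:
    """List (0-based position, strand) for every occurrence of the core motif.

    Positions always refer to the top strand; '-' hits are matches to the
    reverse complement of the core.
    """
    seq = seq.upper()
    rc_core = reverse_complement(core)
    hits = []
    for i in range(len(seq) - len(core) + 1):
        window = seq[i:i + len(core)]
        if window == core:
            hits.append((i, "+"))
        if window == rc_core and rc_core != core:
            hits.append((i, "-"))
    return hits


# Toy 16-bp sequence invented for illustration (not a real promoter).
toy = "GGTAATCCGATTAGCC"
sites = find_core_sites(toy)  # one hit on each strand
```

In a random sequence with equal base frequencies, a given 4-bp word is expected about once every 4^4 = 256 positions per strand, which is one way to see why the TAAT core alone cannot account for the specific effects in vivo.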
Solutions of the paradox
- Conformational effects mediated by the N-terminal arm
  - Even if the α-helical HDs are very similar, a much larger diversity is found in the N-terminal arms that contact the minor groove
- Protein-protein interaction with other TFs through the N-terminal arm - enhanced affinity/specificity - the basis of combinatorial control
  - MATα2 interaction with MCM1 - cooperative interactions
  - Ultrabithorax-Extradenticle in Drosophila
  - Hox-Pbx1 in mammals
Combinatorial TFs give enhanced specificity
- TFs encoded by the homeotic (Hox) genes govern the choice between alternative developmental pathways along the anterior-posterior axis.
- Hox proteins, such as Drosophila Ultrabithorax, have low DNA-binding specificity by themselves but gain affinity and specificity when they bind together with the homeoprotein Extradenticle (or Pbx1 in mammals).
N-tail in protein-protein interaction - adopts different conformations
- Conformation determined by protein-protein interaction
[Figure labels: HD; HD; MATα2/Mcm1]
It works impressively well
- Hox genes
POU family
POU family: common DBD structure
- The POU name:
  - Pit-1, pituitary-specific TF
  - Oct-1 and Oct-2, lymphoid TFs
  - Unc-86, TF that regulates neuronal development in C. elegans
- A bipartite 160 aa homeodomain-related DBD
  - a POU-type HD subdomain (C-terminally located)
  - a POU-specific subdomain (N-terminally located)
  - coupled by a variable linker (15-30 aa)
- POU is a structurally bipartite motif that arose by the fusion of genes encoding two different types of DNA-binding domain.
POU: two independent subdomains
- POU-HD subdomain
  - 60 aa, closely similar to the classical HD
  - Only weakly DNA-binding by itself (<HD)
  - Contacts the 3´ half-site (Oct-1: ATGCAAAT)
  - Docking similar to Engrailed, Antp etc.
  - Main contribution to non-specific backbone contacts
- POU-specific subdomain
  - 75 aa POU-specific domain
  - Enhances DNA affinity 1000x
  - Contacts the 5´ half-site (Oct-1: ATGCAAAT)
  - Contacts the opposite side of the DNA relative to the HD
  - Structure similar to the prokaryotic λ and 434 repressors
- The two-part DNA-binding domain partially encircles the DNA.
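The division of labour between the two subdomains can be illustrated by annotating the Oct-1 octamer in a sequence. This is a sketch under one explicit assumption: the two half-sites are taken to be the two 4-bp halves of ATGCAAAT, with the 5´ half assigned to the POU-specific subdomain and the 3´ half to the POU-HD subdomain. The function name and the toy sequence are hypothetical.

```python
OCTAMER = "ATGCAAAT"  # Oct-1 site quoted on the slide

# Assumption: each half-site is one 4-bp half of the octamer.
POU_SPECIFIC_HALF = OCTAMER[:4]  # 5' half-site
POU_HD_HALF = OCTAMER[4:]        # 3' half-site


def annotate_octamers(seq: str) -> list[dict]:
    """Find every octamer on the given strand and label its two half-sites."""
    seq = seq.upper()
    found = []
    start = seq.find(OCTAMER)
    while start != -1:
        found.append({
            "start": start,
            "pou_specific": (start, seq[start:start + 4]),
            "pou_hd": (start + 4, seq[start + 4:start + 8]),
        })
        start = seq.find(OCTAMER, start + 1)
    return found


# Toy sequence invented for illustration.
annotated = annotate_octamers("CCATGCAAATGG")
```

For the toy sequence, the single octamer starts at position 2, so the POU-specific half-site is reported at position 2 and the POU-HD half-site at position 6.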
Flexible DNA recognition
- POU domains have intrinsic conformational flexibility
  - and this feature appears to confer functional diversity in DNA recognition
- The subdomains are able to assume a variety of conformations, dependent on the DNA element.
ZNFs: zinc finger families
Zinc finger proteins
- Zinc finger proteins were first discovered as transcription factors.
- Zinc finger proteins are among the most abundant proteins in eukaryotic genomes.
- Their functions are extraordinarily diverse
  - include DNA recognition, RNA packaging, transcriptional activation, regulation of apoptosis, protein folding and assembly, and lipid binding.
- Zinc finger structures are as diverse as their functions.
Examples
- C2H2-type Zif from Sp1 (3rd finger)
- C4-type Zif from GATA-1
- LIM-domain-type Zif from ACRP
- PKC-type Zif
[Figure label: Zn++]
The C2H2 subfamily
Classical TFIIIA-related zinc fingers: n x [Zn-C2H2]
- History: Xenopus TFIIIA, the first isolated and cloned eukaryotic TF
  - Function: activation of 5S RNA transcription (RNAPIII)
  - Rich source: accumulated in immature Xenopus oocytes as “storage particles” = TFIIIA + 5S RNA (≈ 15% of total soluble protein)
  - Purified 1980, cloned in 1984
  - Mr = 38 600, 344 aa
- Primary structure of TFIIIA
  - Composed of repeats: 9 x 30 aa minidomains + 70 aa unique region C-terminally
  - Each minidomain has a conserved pattern of 2 Cys + 2 His
  - Hypothesis: each minidomain is structured around a coordinated zinc ion (confirmed later)
[Figure: repeated minidomains, each with a Zn++]
Zinc finger proteins
- Finger-like in 2D
- Not in 3D
Common features of TFIIIA-related zinc fingers
- Consensus for each finger: FXCX2-5CX3FX5FX2HX2-5H
- Number of fingers in related factors varies: 2-37
- Number of members exceptionally high
  - S. cerevisiae genome: 34 C2H2 zinc fingers
  - C. elegans genome: 68 C2H2 zinc fingers
  - Drosophila genome: 234 C2H2 zinc fingers
  - Human genome: 564 C2H2 zinc fingers (135 C3HC4 zinc fingers)
- We now recognize the classical C2H2 zinc finger as the first member of a rapidly expanding family of zinc-binding modules.
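The finger consensus translates directly into a regular expression, reading X as "any residue" and X2-5 as a run of two to five residues. This is a sketch that takes the consensus literally as written on the slide; the function name is hypothetical and the test sequence is synthetic, built only to satisfy the pattern.

```python
import re

# Slide consensus F-X-C-X(2-5)-C-X3-F-X5-F-X2-H-X(2-5)-H, with X = any residue.
C2H2_CONSENSUS = re.compile(r"F.C.{2,5}C.{3}F.{5}F.{2}H.{2,5}H")


def find_fingers(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched segment) for each non-overlapping match."""
    return [(m.start(), m.group())
            for m in C2H2_CONSENSUS.finditer(protein.upper())]


# Synthetic finger built to satisfy the pattern (not a real protein).
finger = ("F" + "A" + "C" + "AA" + "C" + "AAA" + "F"
          + "AAAAA" + "F" + "AA" + "H" + "AAA" + "H")
toy_protein = "MG" + finger + "TGEKP" + finger
hits = find_fingers(toy_protein)  # two fingers, at positions 2 and 30
```

A literal pattern like this is strict: real fingers that carry, for example, a tyrosine at one of the F positions would be missed, so genome-wide counts such as those listed above are normally obtained with more tolerant profile-based searches.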
3D structure of the classical C2H2 type of zinc fingers
- Each finger = a minidomain with ββα structure
  - Each finger an independent module
  - Several fingers linked together by flexible linkers
  - First 3D structure: the 3-finger Zif268 (mouse)
- DNA interaction in Zif268
  - Major groove contact through the α-helix
  - Recognition of base triplets
  - aa in three positions responsible for sequence recognition: -1, 3 and 6 (relative to the α-helix)
  - Simple one-to-one pattern (contact aa - base): can a recognition code be defined?
- DNA interaction in GLI and TTK differs
  - Different phosphate contacts
  - Distortion of DNA
  - Finger 1 without DNA contact
The Zif268 prototype
- Finger 2 from Zif268
  - including the two cysteine side chains and two histidine side chains that coordinate the zinc ion
- DNA-recognition residues
  - indicated by the numbers identifying their position relative to the start of the recognition helix
Three fingers in Zif268
- Zif268 - first multifinger structure
- Recognition of base triplets
[Figure labels: Finger 1, Finger 2, Finger 3, linker, DNA]
Recognition code?
- The DNA sequence of the Zif268 site is color coded to indicate base contacts made by each finger.
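The idea of one finger per base triplet can be made concrete by splitting a site into triplets and assigning one to each finger. This is a sketch under two assumptions that are not stated in the slide text: the 9-bp Zif268 site is taken to be GCGTGGGCG, as commonly cited, and finger 1 is taken to contact the 3´-most triplet of the strand as written. The function name is hypothetical.

```python
def assign_triplets(site: str, n_fingers: int = 3) -> dict[str, str]:
    """Split a binding site into consecutive base triplets, one per finger.

    Assumption (not stated on the slide): finger 1 contacts the 3'-most
    triplet of the strand as written, so fingers are numbered from the 3' end.
    """
    site = site.upper()
    if len(site) != 3 * n_fingers:
        raise ValueError("site length must be 3 bp per finger")
    triplets = [site[i:i + 3] for i in range(0, len(site), 3)]
    return {f"finger {n_fingers - k}": t for k, t in enumerate(triplets)}


# Zif268 site as commonly cited; an assumption here, since the slide shows
# the sequence only in a figure.
ZIF268_SITE = "GCGTGGGCG"
contacts = assign_triplets(ZIF268_SITE)
```

Under these assumptions fingers 1 and 3 both read GCG and finger 2 reads TGG, which is the kind of regularity that motivates asking whether a general recognition code exists.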
Structure of the six-finger TFIIIA-DNA complex
- In a multi-finger protein some fingers contact base pairs and some do not, functioning instead as bridges
  - Fingers 1-2-3, separated by typical linkers, wrap smoothly around the major groove like those of Zif268
  - In contrast, fingers 4-5-6 form an open, extended structure running along one side of the DNA. Of these, only finger 5 makes contacts with bases in the major groove. The flanking fingers, 4 and 6, appear to serve primarily as spacer elements.
Nuclear receptors: 2 x C4
Nuclear receptors: 2 x [Zn-C4]
- Large family where the DBD binds two Zn++ through a tetrahedral pattern of Cys
- Conserved DBD, 70-80 aa
- Protein structure
  - Two “zinc fingers” constitute one separate domain
  - Two α-helices with C3-Zn-C4 N-terminally
  - These lie perpendicular on top of each other, with hydrophobic interactions
- Mediates transcriptional response to complex extracellular signals
- Evolutionarily coupled to multicellular organisms
  - Yeast = 0, but C. elegans 233, or 1.5% of genes!
  - Sequence prediction: 90% of those with a nuclear receptor DBD have a potential ligand-binding domain
  - Implies that lipophilic signal molecules have been important in establishing communication between cells
DNA binding by nuclear receptors
Nuclear receptors - DNA interaction
- 3D protein-DNA structure
  - Glucocorticoid receptor + estrogen receptor
- Dimer in complex (monomer in solution)
- DNA interaction
  - First “finger” binds DNA
  - Second “finger” is involved in dimerization
  - Binds to neighboring major grooves on the same side of the DNA
  - Extensive phosphate contact, and recognition helix docked into the groove
  - Specificity determined by 3 aa (E2, G3, A6) in the recognition helix
- Structured dimer interface formed upon DNA binding
GATA factors
GATA factors: 1 x [Zn-C4]
- Small family
- Prototype erythroid TF: GATA-1 (2 fingers)
  - C-terminal finger: DNA binding
  - N-terminal finger: protein interactions?
- From fungi to humans
- Structure ≈ first finger in nuclear receptors
- Hydrophobic DNA interface
- Evolutionary implications
  - Early duplication of a primitive finger; divergent functions developed in NR
Gal4p factors
GAL4-related factors: 1 x [Zn2-C6]
- GAL4 DBD =
  - 28 aa Cys-rich domain, binds 2 Zn++
  - + 26 aa C-terminal domain involved in dimerization
- Cys-rich domain
  - Consensus: CX2CX6CX6CX2CX6C
  - A Zn-Cys cluster with shared Cys (1st and 4th)
  - Two short α-helices with C-Zn-C N-terminally
GAL4-related factors: 1 x [Zn2-C6]
- Dimerization domain
  - Monomer in solution, dimer in DNA complex
  - In solution only the Cys-rich motif is structured
  - In complex forms two extended helix-strand motifs
  - Amphipathic helices form a dimer interface in the complex
- DNA interaction
  - Contacts CGG triplets in the major groove
  - C-terminal end of the 1st α-helix contacts bases
  - Phosphate contact via the helix-strand motif
  - Coiled-coil dimer interface at right angle to the DNA (≈ bZIP)
  - Linker determines spacing of the CGG triplets: 11 bp in GAL4, 6 bp in PPR1
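Because the two factors read the same triplets and differ only in spacing, a site search needs just one parameter. This is a minimal sketch assuming that the second half-site appears as CCG on the strand as written, that is, a CGG triplet read on the opposite strand, as expected for a dimer bound to an inverted repeat. The function name and the toy sequences are hypothetical.

```python
import re


def find_cgg_sites(seq: str, spacer: int) -> list[tuple[int, str]]:
    """Find CGG-N(spacer)-CCG sites; overlapping matches are reported too.

    Assumption: the second half-site is written as CCG, i.e. a CGG triplet
    read on the opposite strand (inverted repeat bound by a dimer).
    """
    # Lookahead keeps matches zero-width so overlapping sites are not skipped.
    pattern = re.compile(rf"(?=(CGG[ACGT]{{{spacer}}}CCG))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


SPACING = {"GAL4": 11, "PPR1": 6}  # bp between the triplets, from the slide

# Toy sequences invented for illustration (not real binding sites).
toy_gal4 = "TT" + "CGG" + "A" * 11 + "CCG" + "TT"
toy_ppr1 = "TT" + "CGG" + "A" * 6 + "CCG" + "TT"
```

Searching `toy_gal4` with `SPACING["GAL4"]` finds the site at position 2, while searching the same sequence with `SPACING["PPR1"]` finds nothing, mirroring how the linker length, rather than the triplet contacts, sets the target choice.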