Properties of TFIIIA Protein Problem-Oriented Learning (II) YMU-Biochem
Standard Analysis of a Cloned Gene
  [Flowchart. Steps: Database Search, Multiple Sequence Alignment, Gene Searching, Fragment Assembly, Profile Analysis, Pattern Recognition, Protein Analysis, Phylogenetic Analysis, Publication. Decision points: Find Similar Sequence? Consistent? Find Consensus?]
Pattern Recognition
Jargon
  “Motifs” in bioinformatics: a particular cluster of residue types that identifies an important function
  “Motifs” in protein structures: supersecondary structures made up of a few secondary structure elements such as helices and sheets, e.g. Greek key, hairpin…
Motifs Can Provide Insights on Possible Protein Functions
  Part of prosite.seqcat:
  …
  Atp_Gtp_A           ATP/GTP-binding site motif A (P-loop).                    11/19  (0017.pdoc)  (A,G)x4GK(S,T)
  Atpase_Na_K_Beta_1  Sodium and potassium ATPases beta subunits signature 1.   11/19  (0328.pdoc)  (F,Y,W)…
  Atpase_Na_K_Beta_2  Sodium and potassium ATPases beta subunits signature 2.   11/19  (0328.pdoc)  (R,K)x2…
Two Ways to Recognize Protein Patterns
  “Motifs” looks for PROSITE patterns in a sequence (one sequence against Pattern 1, Pattern 2, Pattern 3, Pattern 4)
  “FindPatterns” identifies sequences that contain given patterns (the patterns against Seq 1, Seq 2, Seq 3, Seq 4, Seq 5)
Demo 1 Examine the known motifs on TFIIIA protein sequence
Conventions to Express Patterns in Prosite
  Pattern of Zinc_Finger_C2H2: Cx{2,4}Cx3(L,I,V,M,F,Y,W,C)x8Hx{3,5}H
  x8 means any 8 amino acids
  x{2,4} means 2 to 4 x (any amino acids)
  (L,I,V) means L, I, or V at this location
  x{,4} means 0 to 4 of any amino acid
  x{2,} means 2 to 350,000 of any amino acid
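The conventions above map closely onto regular expressions. The following is a minimal sketch, written in Python rather than with the GCG programs, that translates a pattern in this notation into a regular expression and scans a sequence with it. The function names `gcg_pattern_to_regex` and `find_pattern` and the toy peptide are illustrative assumptions, not part of GCG or PROSITE, and the sketch does not reproduce how Motifs or FindPatterns work internally.

```python
import re


def gcg_pattern_to_regex(pattern):
    """Translate a pattern in the slide's notation into a Python regex string."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "(":                      # (L,I,V) -> [LIV]
            j = pattern.index(")", i)
            residues = pattern[i + 1:j].replace(",", "").replace(" ", "")
            out.append("[" + residues.upper() + "]")
            i = j + 1
        elif ch == "{":                    # {2,4} stays; {,4} -> {0,4}; {2,} stays
            j = pattern.index("}", i)
            body = pattern[i + 1:j].replace(" ", "")
            if "," in body:
                low, high = body.split(",")
                body = (low or "0") + "," + high
            out.append("{" + body + "}")
            i = j + 1
        elif ch.isdigit():                 # x8 -> .{8}; (D,E)2 -> [DE]{2}
            j = i
            while j < len(pattern) and pattern[j].isdigit():
                j += 1
            out.append("{" + pattern[i:j] + "}")
            i = j
        elif ch in "xX":                   # x -> any amino acid
            out.append(".")
            i += 1
        elif ch.isspace():
            i += 1
        else:                              # a literal residue such as C or H
            out.append(ch.upper())
            i += 1
    return "".join(out)


def find_pattern(pattern, sequence):
    """Return (1-based start, matched text) for each non-overlapping match."""
    regex = re.compile(gcg_pattern_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence.upper())]


ZINC_FINGER_C2H2 = "Cx{2,4}Cx3(L,I,V,M,F,Y,W,C)x8Hx{3,5}H"
# Toy peptide built to satisfy the pattern; it is not a real TFIIIA fragment.
TOY_FINGER = "CAAC" + "AAA" + "F" + "A" * 8 + "H" + "AAA" + "H"

print(gcg_pattern_to_regex(ZINC_FINGER_C2H2))
print(find_pattern(ZINC_FINGER_C2H2, "MG" + TOY_FINGER))
```

Note that the regex form of x{2,} has no upper bound, whereas the slide gives 350,000 as the practical limit.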
Demo 2 How to Get the Documentation of a Pattern? (Zinc_Finger_C3HC4)
If You Find a New Motif and Want to Check Whether It Exists in Other Sequences…
  Define your own patterns in prosite.patterns
  Run a “Motifs” search
Format of prosite.patterns
  Name              Offset  Pattern                                       PDoc_Name  ..
  11s_Seed_Storage  1       NGx(D,E)2x(L,I,V,M,F)C(S,T)x{11,12}(P,A,G)D   0284.pdoc
  1433_1            1       RNL(L,I,V)S(V,G)(G,A)YKN(I,V)                 0633.pdoc
  1433_2            1       YK(D,E)STLI(I,M)QLL(R,H)DN(L,F)TLW(T,A)(S,A)  0633.pdoc
  25a_Synth_1       1       GGSx(A,G)(K,R)xTxL(K,R)(G,S,T)xSD(A,G)        0653.pdoc
  25a_Synth_2       1       RPVILDPx(D,E)PT                               0653.pdoc
  (Add your own patterns here. Name, offset, pattern, and pdoc_name are required.)
  AIDS              1       AIDS                                          aids.pdoc
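Because every entry needs all four fields, it can help to check a hand-edited file before running a search. The sketch below reads entries in the layout shown on the slide. It assumes that the header ends on a line finishing with "..", that fields are separated by whitespace, and that a pattern contains no spaces; these are assumptions about the file, and `parse_pattern_entries` is a hypothetical helper rather than a GCG tool.

```python
def parse_pattern_entries(text):
    """Return one dict per well-formed entry found after the '..' header line."""
    entries = []
    in_body = False
    for line in text.splitlines():
        if not in_body:
            # Assumption: everything up to the line ending in '..' is header.
            in_body = line.rstrip().endswith("..")
            continue
        fields = line.split()
        # Skip blank lines, comments, and entries missing a required field.
        if len(fields) != 4 or not fields[1].isdigit():
            continue
        name, offset, pattern, pdoc_name = fields
        entries.append({
            "name": name,
            "offset": int(offset),
            "pattern": pattern,
            "pdoc_name": pdoc_name,
        })
    return entries


EXAMPLE = """\
Name Offset Pattern PDoc_Name ..
1433_1 1 RNL(L,I,V)S(V,G)(G,A)YKN(I,V) 0633.pdoc
25a_Synth_2 1 RPVILDPx(D,E)PT 0653.pdoc
(Add your own patterns here.)
AIDS 1 AIDS aids.pdoc
"""

for entry in parse_pattern_entries(EXAMPLE):
    print(entry["name"], entry["offset"], entry["pattern"], entry["pdoc_name"])
```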
Protein Analysis
  PepPlot
  PeptideStructure/PlotStructure
  Moment
  HelicalWheel
  SPScan
  New programs in Ver 9.1: HtHScan, CoilScan
Some Important Properties of Proteins
  Secondary structure or higher
  Antigenicity, flexibility, hydrophilicity, and surface probability
Secondary Structure Information May Suggest Interactions between Molecules
  e.g. in a leucine zipper, the leucines lie along one side of the alpha helix
PepPlot vs PeptideStructure
  PeptideStructure predicts the properties
  PepPlot reveals the tendencies of properties
PepPlot Reveals Tendencies of Properties
Demo 3 Use PeptideStructure to predict the secondary structure and other properties of TFIIIA
Output of PeptideStructure Predicts the Properties at Every Location
  Columns: Pos, AA, GlycoS, HyPhil, SurfPr, FlexPr, CF-Pred, GORPred, AI-Ind
  Pos: 1 2 3 4 5 6 7 8 9 10 …
  AA: M G E K A L P V V Y
  GlycoS: . .
  HyPhil, SurfPr, FlexPr: 1.475 0.820 0.050 0.271 -0.057 -0.714 -1.029 -0.129 0.600 1.270 1.004 0.648 1.012 0.759 0.325 0.255 0.505 1.198 1.214 1.000 0.996 0.969 0.945 0.941 0.952 0.967
  CF-Pred, GORPred: . . H H H . B B B H H H H B B
  AI-Ind: 0.900 0.450 0.600 -0.300 -0.600 -0.150 0.750
Correlation of Zn-Finger and the Predicted α-Helix in TFIIIA
  Zn-Finger  Range    GOR               Chou & Fasman*
  1          13-17    28-34             28-37 (h)
  2          38-67    46-59             46-51 (H)
  3          68-98    82-96             81-94 (h)
  4          99-129   109-124           114-119 (H)
  5          130-159  151-158           151-159 (h)
  6          160-188  184-188           182-188 (h)
  7          189-214  189-200, 207-212  200-214 (h)
  8          215-246  none              215-217, 240-246 (h)
  9          247-276  260-276           247-257, 263-276 (h)
  *H: stronger helix former, h: weak helix former
Exercise 1 Print the squiggles output of the predicted secondary structure with hydrophilicity on it.
Exercise 2 Use HelicalWheel to examine the distribution of hydrophobic residues in amino acids 115-129
Demo 4 Use Moment to examine the hydrophobic property of TFIIIA protein
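The idea behind a hydrophobic moment can be sketched in a few lines: each residue in a segment is treated as a vector whose length is its hydrophobicity and whose direction advances by 100 degrees per residue (the helical-wheel projection of an alpha helix), and the moment is the size of the vector sum. The sketch below is an illustration only. It assumes the Kyte-Doolittle hydropathy scale, which may not be the scale the GCG Moment program uses, and `hydrophobic_moment` and both test peptides are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale; GCG's Moment may use another).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(segment, angle_deg=100.0, scale=KYTE_DOOLITTLE):
    """Mean hydrophobic moment per residue for a segment at a fixed turn angle."""
    sin_sum = 0.0
    cos_sum = 0.0
    for i, residue in enumerate(segment.upper()):
        h = scale[residue]
        theta = math.radians(i * angle_deg)   # angular position on the wheel
        sin_sum += h * math.sin(theta)
        cos_sum += h * math.cos(theta)
    return math.hypot(sin_sum, cos_sum) / len(segment)


# Toy peptides, not TFIIIA: one uniform, one with leucines clustered on one face.
uniform = "L" * 18
amphipathic = "LKKLLKKLLKKLLKKLLK"

print(round(hydrophobic_moment(uniform), 3))
print(round(hydrophobic_moment(amphipathic), 3))
```

A segment whose hydrophobic residues fall on one side of the helix gives a large moment, while a uniformly hydrophobic or uniformly polar segment gives a moment near zero. Passing a different `angle_deg` tests other periodicities.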
Multiple Sequence Alignment
Programs in GCG Suite for Multiple Sequence Alignment
  PileUp
  Pretty
  LineUp
Demo 5 Align all the TFIIIA cDNA sequences in the GenEMBL database using PileUp.
Concept of List File
  List of path and file names of database sequences
  Sequence attributes (only work in some programs)
    range (specified by Begin and End)
    weight (specified by wgt or weight)
  Programs which generate list files, e.g. FastA, StringSearch, etc.
  See “Using Sequences” in the “User’s Guide”
How to Use a List File?
  % pileup @tfiiia.list
Exercise 3 Align all the TFIIIA sequences in the SwissProt database.
Multiple Sequence File (msf) Format
  tf3adna.msf  MSF: 1761  Type: N  April 11, 1998 22:25  Check: 1506  ..
  Name: XBTF3A     Len: 1761  Check: 2616  Weight: 1.00
  Name: XELTFIIIA  Len: 1761  Check: 3532  Weight: 1.00
  Name: RANTFIIIA  Len: 1761  Check: 7914  Weight: 1.00
  Name: HSU14134   Len: 1761  Check: 9540  Weight: 1.00
  Name: YSCTFIIIA  Len: 1761  Check: 7904  Weight: 1.00
  //
  …
             101                                                150
  XBTF3A     ~~GAATTCGC GGCCGCAGTT GCGAGGAAGC CATAGAAAGT CATAGGAAGT
  XELTFIIIA  ~~~~~~~~~~GAA TTCCGGAAGC CGAGGGCTGT
  RANTFIIIA  ~~~~~~~~~~CTTTAA GTATTGGGAA CCCGGGACGG
  HSU14134   CCGTGGTCGC CGAGTCGGTG TCGTCCTTGA CCATCGCCGA CGCGTTCATT
  YSCTFIIIA  AAAAAT TTTTCA ATTAAATATC ATACTATACG AATAATCAAT
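The layout above is regular enough to read with a short script: the "Name:" lines declare the sequences, "//" ends the header, and each later block repeats the names followed by chunks of aligned sequence. The following is a minimal sketch under those assumptions. `parse_msf` is a hypothetical helper, the toy file is invented for illustration with placeholder checksums, and the sketch neither verifies checksums nor handles every variant of the format.

```python
def parse_msf(text):
    """Return {name: aligned sequence} from MSF-formatted text."""
    seqs = {}
    in_alignment = False
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if not in_alignment:
            if fields[0] == "Name:":
                seqs[fields[1]] = ""           # declare a sequence
            elif fields[0] == "//":
                in_alignment = True            # header is finished
        elif fields[0] in seqs:
            # Join the space-separated chunks and append to that sequence.
            seqs[fields[0]] += "".join(fields[1:])
        # Coordinate lines such as "1 10" match no name and are skipped.
    return seqs


TOY_MSF = """\
 toy.msf  MSF: 20  Type: N  Check: 0  ..

 Name: seqA  Len: 20  Check: 0  Weight: 1.00
 Name: seqB  Len: 20  Check: 0  Weight: 1.00

//

      1        10
seqA  ~~GAATTCGC
seqB  CCGTGGTCGC

      11       20
seqA  GGCCGCAGTT
seqB  CGAGTCGGTG
"""

for name, sequence in parse_msf(TOY_MSF).items():
    print(name, len(sequence), sequence)
```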
Demo 6 Use Pretty to show the multiple sequence alignment of TFIIIA DNA sequences and show consensus positions as “*”
How to Use an Msf File?
  % pretty tfiiia.msf{*}
Exercise 4 Use Pretty to show the multiple sequence alignment of TFIIIA protein sequences
  display the consensus sequence
  display the most conserved consensus positions with “*”
Answer
  gcg% pretty -con -dif=* -in=tf3asw.msf{*} -out=tf3asw.pretty
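To make the notion of a consensus concrete, the sketch below derives one from already aligned sequences by taking the most common residue in each column, and marks with "*" the columns where every sequence agrees. This is an illustrative simplification, not a reproduction of how Pretty computes its consensus or interprets -con and -dif. `consensus_and_marks` and the toy alignment are hypothetical, and the sequences are assumed to have equal length.

```python
from collections import Counter

GAP_CHARS = "~.-"   # characters treated as gaps in the aligned sequences


def consensus_and_marks(aligned):
    """Return (consensus, marks); marks has '*' where a column is fully conserved."""
    consensus = []
    marks = []
    for column in zip(*aligned):
        residues = [c.upper() for c in column if c not in GAP_CHARS]
        if not residues:                      # a column made only of gaps
            consensus.append("-")
            marks.append(" ")
            continue
        top, count = Counter(residues).most_common(1)[0]
        consensus.append(top)
        # Fully conserved means every sequence, gaps included, shows this residue.
        marks.append("*" if count == len(column) else " ")
    return "".join(consensus), "".join(marks)


toy_alignment = ["ACGT-", "ACGA-", "ACCTA"]   # invented example, not TFIIIA
consensus, marks = consensus_and_marks(toy_alignment)
for row in toy_alignment:
    print(row)
print(consensus)
print(marks)
```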
What You Have Already Learned
  How to use the programs Motifs, PeptideStructure, PlotStructure, PileUp, and Pretty
  Define your own patterns
  Make multiple sequence alignments
  Concept of list files