Protein 1: Structure & Interactions (Last week)
• Protein interaction code(s)?
• Real world programming
• Pharmacogenomics: SNPs
• Chemical diversity: Nature/Chem/Design
• Target proteins: structural genomics
• Folding, molecular mechanics & docking
• Toxicity animal/clinical: cross-talk

Protein 2: Properties & Quantitation
• Separation of proteins & peptides
• Protein localization & complexes
• Peptide identification (MS/MS)
  – Database searching & sequencing
• Protein quantitation
  – Absolute & relative
• Protein modifications & crosslinking
• Protein - metabolite quantitation

Why purify?
• Reduce one source of noise (in identification/quantitation)
• Prepare materials for in vitro experiments (sufficient causes)
• Discover biochemical properties

(Protein) Purification Methods
• Charge: ion-exchange chromatography, isoelectric focusing
• Size: dialysis, gel-filtration chromatography, gel electrophoresis, sedimentation velocity
• Solubility: salting out
• Hydrophobicity: reverse-phase chromatography
• Specific binding: affinity chromatography
• Complexes: immune precipitation (± crosslinking)
• Density: sedimentation equilibrium

Protein Separation by Gel Electrophoresis
• Separated by mass: sodium dodecyl sulfate (SDS) polyacrylamide gel electrophoresis.
  – Sensitivity: 0.02 µg protein with a silver stain.
  – Resolution: 2% mass difference.
• Separated by isoelectric point (pI): polyampholyte pH gradient gel.
  – Resolution: 0.01 pI.

Comparison of predicted with observed protein properties (localization, postsynthetic modifications): E. coli
Link et al. 1997, Electrophoresis 18: 1259-313

Computationally checking proteomic data

Property         Basis of calculation
Protein charge   RKHYCDE (N, C), pKa, pH
Protein mass     Calibrate with knowns (complexes); isotope sum (incl. modifications)
Peptide LC       aa composition linear regression
Subcellular      Hydrophobicity, motifs
Expression       Codon Adaptation Index (CAI)
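As a rough illustration of the "protein charge" row (RKHYCDE side chains plus the N and C termini, pKa, pH), the sketch below sums Henderson-Hasselbalch fractional charges and finds the pI by bisection. The pKa values and the names `net_charge` and `isoelectric_point` are illustrative assumptions, not values taken from the lecture.

```python
# Approximate pKa values for ionizable groups (illustrative assumptions;
# published tables differ by a few tenths of a pH unit).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0, "N_term": 9.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_term": 2.0}


def net_charge(sequence, ph):
    """Estimate the net charge of a peptide or protein at a given pH."""
    counts = {aa: sequence.count(aa) for aa in "KRHDECY"}
    counts["N_term"] = 1
    counts["C_term"] = 1
    charge = 0.0
    # Basic groups carry +1 when protonated.
    for group, pka in POSITIVE_PKA.items():
        charge += counts[group] / (1.0 + 10 ** (ph - pka))
    # Acidic groups carry -1 when deprotonated.
    for group, pka in NEGATIVE_PKA.items():
        charge -= counts[group] / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(sequence, lo=0.0, hi=14.0, tol=1e-4):
    """Find the pH of zero net charge by bisection.

    Net charge falls monotonically as pH rises, so bisection converges.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(sequence, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

A predicted pI from such a calculation is the kind of value that could be compared with the observed position on an isoelectric focusing gel.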

Protein 2: Today's story & goals
• Separation of proteins & peptides
• Protein localization & complexes
• Peptide identification (MS/MS)
  – Database searching & sequencing
• Protein quantitation
  – Absolute & relative
• Protein modifications & crosslinking
• Protein - metabolite quantitation

Cell fraction: Periplasm
2D gel: SDS mobility (Mr) vs. isoelectric pH
Link et al. 1997, Electrophoresis 18: 1259-313

Cell localization predictions
• TargetP: using N-terminal sequence, discriminates mitochondrion, chloroplast, secretion, & "other" localizations with a success rate of 85%.
• Gromiha 1999, Protein Eng 12: 557-61. A simple method for predicting transmembrane alpha helices with better accuracy. Using the information from the topology of 70 membrane proteins... correctly identifies 295 transmembrane helical segments in 70 membrane proteins with only two overpredictions.

Isotope calculations
Mass resolution: 0.1% vs. 1 ppm

Symbol   Mass        Abund. (%)   Symbol   Mass        Abund. (%)
H(1)      1.007825   99.99        H(2)      2.014102   0.015
C(12)    12.000000   98.90        C(13)    13.003355   1.10
N(14)    14.003074   99.63        N(15)    15.000109   0.37
O(16)    15.994915   99.76        O(17)    16.999131   0.038
S(32)    31.972072   95.02        S(33)    32.971459   0.75
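The isotope table can be turned into a small calculation. The sketch below is a minimal example under stated assumptions: it uses the light-isotope masses and abundances listed on the slide, while the function names and the glycine example (C2H5NO2) are hypothetical additions for illustration.

```python
# Light-isotope masses and abundances, copied from the slide's table.
MONO_MASS = {"H": 1.007825, "C": 12.000000, "N": 14.003074,
             "O": 15.994915, "S": 31.972072}
LIGHT_FRACTION = {"H": 0.9999, "C": 0.9890, "N": 0.9963,
                  "O": 0.9976, "S": 0.9502}


def monoisotopic_mass(formula):
    """Sum the light-isotope masses for an elemental formula given as a dict."""
    return sum(MONO_MASS[element] * n for element, n in formula.items())


def monoisotopic_fraction(formula):
    """Fraction of molecules in which every atom is the light isotope."""
    fraction = 1.0
    for element, n in formula.items():
        fraction *= LIGHT_FRACTION[element] ** n
    return fraction


def mass_tolerance(mass, relative_resolution):
    """Absolute mass window for a relative resolution (1e-3 = 0.1%, 1e-6 = 1 ppm)."""
    return mass * relative_resolution


glycine = {"C": 2, "H": 5, "N": 1, "O": 2}  # hypothetical example molecule
print(monoisotopic_mass(glycine))
print(monoisotopic_fraction(glycine))
print(mass_tolerance(1000.0, 1e-3), mass_tolerance(1000.0, 1e-6))
```

At a mass of 1000, a resolution of 0.1% corresponds to a window of about 1 mass unit, whereas 1 ppm corresponds to about 0.001, which is what makes isotope-level mass sums worth computing.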

Computationally checking proteomic data

Property         Basis of calculation
Protein charge   RKHYCDE (N, C), pKa, pH
Protein mass     Calibrate with knowns (complexes); isotope sum (incl. modifications)
Peptide LC       aa composition linear regression
Subcellular      Hydrophobicity, motifs
Expression       Codon Adaptation Index (CAI)

Trypsin; High Performance Liquid Chromatography (HPLC)

Mobile Phase of HPLC
• The interaction between the mobile phase and the sample determines the migration speed.
  – Isocratic elution: the mobile-phase composition is constant, so the migration speed in the column is constant.
  – Gradient elution: the mobile-phase composition changes during the run, so the migration speed in the column changes.

Stationary Phase of HPLC
• The degree of interaction with samples determines the migration speed.
  – Liquid-Solid: polarity.
  – Liquid-Liquid: polarity.
  – Size-Exclusion: porous beads.
  – Normal Phase: hydrophilicity and lipophilicity.
  – Reverse Phase: hydrophilicity and lipophilicity.
  – Ion Exchange.
  – Affinity: specific affinity.

RP-LC retention: calculated vs. observed (the calculated curve is displaced upward for clarity). Empirical linear regression varies with type of LC material.
Sereda, T. et al. “Effect of the α-amino group on peptide retention behaviour in reversed-phase chromatography.”
Wilce et al. “High-performance liquid chromatography of amino acids, peptides and proteins.” Journal of Chromatography, 632 (1993) 11-18.
Retention coefficients for residues W F L I M Y V C P E A D G T S Q N R H K (column labels: α-NH3+? no, yes, no; C18):
10.1 9.3 9.8 8.8 5.5 8.8 7.5 4.6 9.5 5.8 3.0 8.4 4.8 3.0 2.6 4.5 3.1 6.1 3.5 1.3 4.9 3.4 2.9 0.5 2.7 0.7 2.8 0.3 0.5 0.8 0.2 0.1 1.7 0.0 0.6 1.1 0.0 0.4 -0.1 1.0 1.8 -0.1 0.3 -0.9 0.0 -0.7 -3.0 -2.1 0.0 -3.1 -2.1 2.4 -3.3 -1.5 0.6 -3.5 -1.6 0.0

A Map is Like a 2D Peptide Gel
• First dimension: reverse-phase chromatography, separation by hydrophobicity (axis: RT, min)
• Second dimension: mass spectrometry, separation by mass (axis: m/z)

What Information Can Be Extracted From A Single Peptide Peak
Isotopic variants of DAFLGSFLYEYSR at RT 36.418 min: 0 × 13C, 1 × 13C, 2 × 13C, 3 × 13C (axes: RT, m/z, abundance)
K. Leptos 2001

Directed Analysis of Large Protein Complexes by 2D separation: strong cation exchange and reversed-phase liquid chromatography.
Link et al. 1999, Nature Biotech. 17: 676-82

A new 40S subunit protein
Legend: #peptides (>9, 5-9, 2-4, 1); #uniquely identified / #genes (1/1, 2/2, 1/2, 0/2)

Protein 2: Properties & Quantitation
• Separation of proteins & peptides
• Protein localization & complexes
• Peptide identification (MS/MS)
  – Database searching & sequencing
• Protein quantitation
  – Absolute & relative
• Protein modifications & crosslinking
• Protein - metabolite quantitation

The Finnigan LCQ: An ESI-QIT Mass Spectrometer
Electrospray ionization chamber; mass analyzer/detector

Tandem Mass Spectrometry
Quadrupole Q1 scans or selects m/z. Q2 transmits those ions through collision gas (Ar). Q3 analyzes the resulting fragment ions.
Siuzdak, Gary. “The emergence of mass spectrometry in biochemical research.” Proc. Natl. Acad. Sci. 1994, 91, 11290-11297.
Roepstorff, P.; Fohlman, J. Biomed. Mass Spectrom. 1984, 11, 601.

Ions

Peptide Fragmentation and Ionization

Tandem Mass Spectra Analysis (y and b ions)
Gygi et al. Mol. Cell. Biol. (1999)

Mass Spectrum Interpretation Challenge
• It is unknown whether an ion is a b-ion, a y-ion, or something else.
• Some ions are missing.
• Each ion has multiple isotopic forms.
• Other ions (a or z) may appear.
• Some ions may lose a water or an ammonia.
• Noise.
• Amino acid modifications.
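To make the b-ion/y-ion terminology concrete, the following sketch computes the ideal singly charged b and y series for a peptide, which is the noise-free ladder that a real spectrum only partly reproduces. The monoisotopic residue masses are standard values; the function name `fragment_ions` is hypothetical, and the sketch ignores the a/z ions, neutral losses, isotopes, and modifications listed above.

```python
# Monoisotopic amino acid residue masses (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of the charge-carrying proton
WATER = 18.010565   # H2O retained by the C-terminal (y) fragment


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as m/z lists for singly charged fragments.

    Entry k of both lists corresponds to cleavage after residue k + 1.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_ions.append(sum(masses[:i]) + PROTON)
        y_ions.append(sum(masses[i:]) + WATER + PROTON)
    return b_ions, y_ions


b, y = fragment_ions("DAFLGSFLYEYSR")
for k, (b_mz, y_mz) in enumerate(zip(b, y), start=1):
    print(f"cleavage after residue {k}: b = {b_mz:.4f}, y = {y_mz:.4f}")
```

Because Ile and Leu share the same residue mass, their b and y ions are indistinguishable in this calculation, which is one reason interpretation stays ambiguous even without noise.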

A dynamic programming approach to de novo peptide sequencing via tandem mass spectrometry
Chen et al. 2000, 11th Annual ACM-SIAM Symp. on Discrete Algorithms, pp. 389-398

SEQUEST: Sequence-Spectrum Correlation
Given a raw tandem mass spectrum and a protein sequence database:
• For every protein in the database,
• For every subsequence of this protein:
  – Construct a hypothetical tandem mass spectrum.
  – Overlap the two spectra and compute the correlation coefficient (CC).
• Report the proteins in the order of CC score.
Eng et al. 1994, Amer. Soc. for Mass Spect. 5: 976-989 (Sequest)
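The loop described on this slide can be sketched in a few lines. This is a deliberately simplified stand-in, not the actual SEQUEST scoring: it bins ideal b/y ions into 1-unit m/z bins, uses a plain Pearson correlation, and enumerates all subsequences of a fixed length range rather than enzyme-specific peptides matched to the precursor mass. All function names and the toy database are hypothetical.

```python
import math

RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON, WATER = 1.007276, 18.010565


def theoretical_spectrum(peptide, n_bins=2000):
    """Hypothetical spectrum: unit intensity at each binned b and y ion."""
    spectrum = [0.0] * n_bins
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    for i in range(1, len(masses)):
        b_mz = sum(masses[:i]) + PROTON
        y_mz = sum(masses[i:]) + WATER + PROTON
        for mz in (b_mz, y_mz):
            bin_index = int(round(mz))
            if bin_index < n_bins:
                spectrum[bin_index] = 1.0
    return spectrum


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_candidates(observed, proteins, min_len=5, max_len=8):
    """Score every subsequence of every protein; best correlation first."""
    scores = []
    for name, seq in proteins.items():
        for start in range(len(seq)):
            last = min(len(seq), start + max_len)
            for end in range(start + min_len, last + 1):
                peptide = seq[start:end]
                cc = correlation(observed, theoretical_spectrum(peptide))
                scores.append((cc, name, peptide))
    scores.sort(reverse=True)
    return scores


# Toy database and a noise-free "observed" spectrum (both hypothetical).
database = {"protA": "MKWVTFISLL", "protB": "GASPVTCLNDQ"}
observed = theoretical_spectrum("WVTFIS")
print(rank_candidates(observed, database)[:3])
```

With a noise-free input the true peptide scores a correlation of 1; real spectra contain missing and extra peaks, so the ranking rather than the absolute score is what gets reported.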

Protein 2: Properties & Quantitation
• Separation of proteins & peptides
• Protein localization & complexes
• Peptide identification (MS/MS)
  – Database searching & sequencing
• Protein quantitation
  – Absolute & relative
• Protein modifications & crosslinking
• Protein - metabolite quantitation

Expression quantitation methods

RNA                                                  Protein
Genes immobilized, labeled RNAs                      Antibody arrays
RNAs immobilized, labeled genes: Northern gel blot   Westerns
QRT-PCR                                              -none-
Reporter constructs                                  same
Fluorescent In Situ (Hybridization)                  (Antibodies)
Tag counting (SAGE)                                  -none-
Differential display                                 mass spec

Molecules per cell

                     E. coli/yeast    Human
Individual mRNAs:    10^-1 to 10^3    10^-4 to 10^5
Proteins:            10 to 10^6       10^-1 to 10^8

MS Protein quantitation: R = 0.84
Link et al.

MS quantitation reproducibility
Sample: Angiotensin, Neurotensin, Bradykinin
Map: 600-700 m/z
CV = σ/μ (standard deviation divided by the mean)

Correlation between protein and mRNA abundance in yeast
Gygi et al. 1999, Mol. Cell. Biol. 19: 1720-30

Normality tests
See Weiss, 5th ed., page 920. Types of non-normality: kurtosis, skewness; (log) transformations to normal.
Futcher et al. 1999, A sampling of the yeast proteome. Mol. Cell. Biol. 19: 7357-7368.

Spearman correlation rank test
$r_s = 1 - \frac{6S}{n^3 - n}$
Rank (from 1 to n, where n is the number of pairs of data) the numbers in each column. If there are ties within a column, then assign all the measurements that tie the same median rank. Note: avoid ties (which reduce the power of the test) by measuring with as fine a scale as possible. S = sum of the squared differences in rank.

X   Y   Rx  Ry
1   8   1   4
6   2   3   1
6   3   3   2
6   4   3   3
n = 4
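The slide's four-pair example can be worked through directly. The sketch below ranks each column, gives tied values their shared (median) rank, and applies the formula rs = 1 - 6S/(n^3 - n); the function names are hypothetical, and the simple formula is only approximate when many ties are present.

```python
def ranks_with_ties(values):
    """Rank from 1 to n; tied values all receive the middle rank of their run."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # median (= mean) of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman rank correlation via rs = 1 - 6S / (n^3 - n)."""
    rx, ry = ranks_with_ties(x), ranks_with_ties(y)
    n = len(x)
    s = sum((a - b) ** 2 for a, b in zip(rx, ry))  # sum of squared rank differences
    return 1 - 6 * s / (n ** 3 - n)


x = [1, 6, 6, 6]
y = [8, 2, 3, 4]
print(ranks_with_ties(x))  # Rx column
print(ranks_with_ties(y))  # Ry column
print(spearman_rs(x, y))
```

For these data the ranks reproduce the Rx and Ry columns of the slide, S = 9 + 4 + 1 + 0 = 14, and rs = 1 - 84/60 = -0.4.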

Correlation of (phosphorimager 35S-Met) protein & mRNA
• rp = 0.76 for log(adjusted RNA) to log(protein)
• rs = 0.74 overall; 0.62 for the top 33 proteins & 0.56 (not significantly different) for the bottom 33 proteins

Observed (Phosphorimage) protein levels vs. Codon Adaptation Index (CAI)
Sharp and Li (1987): $f_i$ is the relative frequency of codon $i$ in the coding sequence, and $W_i$ the ratio of the frequency of codon $i$ to the frequency of the major codon for the same amino acid.
$\ln(\mathrm{CAI}) = \sum_{i=1}^{61} f_i \ln(W_i)$
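A minimal sketch of the CAI formula follows, assuming the weights Wi are derived from codon counts in a reference set of highly expressed genes. The two-amino-acid codon table and the counts are toy values invented for illustration, and the function names are hypothetical; a real calculation would cover all 61 sense codons.

```python
import math
from collections import Counter


def relative_adaptiveness(reference_counts, synonymous_groups):
    """W_i = count of codon i / count of the most used synonymous codon."""
    weights = {}
    for codons in synonymous_groups:
        top = max(reference_counts.get(c, 0) for c in codons)
        if top == 0:
            continue  # amino acid absent from the reference set
        for c in codons:
            weights[c] = reference_counts.get(c, 0) / top
    return weights


def cai(coding_sequence, weights):
    """CAI = exp(sum_i f_i ln W_i), with f_i the codon frequencies in the gene."""
    codons = [coding_sequence[i:i + 3]
              for i in range(0, len(coding_sequence) - 2, 3)]
    # Skip codons with no weight, or zero weight (ln 0 is undefined).
    scored = [c for c in codons if weights.get(c, 0) > 0]
    if not scored:
        raise ValueError("no scorable codons in sequence")
    counts = Counter(scored)
    total = len(scored)
    log_cai = sum((n / total) * math.log(weights[c]) for c, n in counts.items())
    return math.exp(log_cai)


# Toy reference data: Lys (AAA/AAG) and Glu (GAA/GAG) only.
groups = [("AAA", "AAG"), ("GAA", "GAG")]
reference = {"AAA": 80, "AAG": 20, "GAA": 50, "GAG": 50}
w = relative_adaptiveness(reference, groups)
print(cai("AAAGAA", w), cai("AAAAAG", w), cai("AAGAAG", w))
```

A gene built only from major codons scores 1, and each rare codon pulls the geometric mean down, which is why CAI serves as a sequence-only predictor of expression level.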

ICAT Strategy for Quantifying Differential Protein Expression. X = H or D
Gygi et al. Nature Biotechnology (1999)

Mass Spectrum and Reconstructed Ion Chromatograms
Gygi et al. Nature Biotechnology (1999)

Protein & mRNA Ratios +/- Galactose
Ideker et al. 2001

Protein 2: Properties & Quantitation
• Separation of proteins & peptides
• Protein localization & complexes
• Peptide identification (MS/MS)
  – Database searching & sequencing
• Protein quantitation
  – Absolute & relative
• Protein modifications & crosslinking
• Protein - metabolite quantitation

Post-synthetic modifications
• Radioisotopic labeling: PO4 on S, T, Y, H
• Affinity selection:
  – Cys: ICAT biotin-avidin selection
  – PO4: immobilized metal Ga(III) affinity chromatography (IMAC)
  – Specific PO4 antibodies
  – Lectins for carbohydrates
• Mass spectrometry

32P-labeled phosphoproteomics
Low-abundance cell cycle proteins are not detected above the background from abundant proteins.
Futcher et al. 1999, A sampling of the yeast proteome. Mol. Cell. Biol. 19: 7357-7368.

Natural crosslinks

Disulfides              Cys-Cys
Collagen                Lys-Lys
Ubiquitin               C-term-Lys
Fibrin                  Gln-Lys
Glycation               Glucose-Lys
Adeno primer proteins   dCMP-Ser

Crosslinked peptide
Matrix-assisted laser desorption ionization post-source decay (MALDI-PSD MS): tryptic digest of BS3 cross-linked FGF-2. Cross-linked peptides are identified by using the program ASAP and are denoted with an asterisk (9). (B) MALDI-PSD spectrum of cross-linked peptide E45-R60 (M + H+ = m/z 2059.08).

Constraints for homology modeling based on MS crosslinking distances
The 15 nonlocal through-space distance constraints generated by the chemical crosslinks (yellow dashed lines) superimposed on the average NMR structure of FGF-2 (1BLA). The 14 lysines of FGF-2 are shown in red.
Young et al. 2000, PNAS 97: 5802

Homology modeling accuracy
Plot: % sequence identity vs. Swiss-Model RMSD of the test set in Ångstroms

Top 20 threading models for FGF ranked by crosslinking constraint error

Protein 2: Properties & Quantitation
• Separation of proteins & peptides
• Protein localization & complexes
• Peptide identification (MS/MS)
  – Database searching & sequencing
• Protein quantitation
  – Absolute & relative
• Protein modifications & crosslinking
• Protein - metabolite quantitation

Challenges for accurately measuring metabolites
• Rapid kinetics
• Rapid changes during isolation
• Idiosyncratic detection methods: enzyme-linked, GC, LC, NMR (albeit fewer molecular types than RNA & protein)

Databases
598 have identical mass, e.g. Ile & Leu = 131.17 (plot X axis: mass)
• EcoCyc: Karp et al. (1998) NAR 26: 50
• WIT: Selkov et al. (1997) NAR 25: 37
• KEGG: Ogata et al. (1998) Biosystems 47: 119-128

Plot: Y = RP-LC retention time in min; X = mass. Labeled points: I, L, W (higher hydrophobicity).

Wunschel, J Chromatogr A 1997, 776: 205-19. Quantitative analysis of neutral & acidic sugars in whole bacterial cell hydrolysates using high-performance anion-exchange LC-ESI-MS2.
Metabolite fragmentation & stable isotope labeling

Isotopomers
Klapa et al. Biotechnol Bioeng 1999; 62: 375. Metabolite and isotopomer balancing in the analysis of metabolic cycles: I. Theory.
"accounting for the contribution of all pathways to label distribution is required, especially... multiple turns of metabolic cycles... 13C (or 14C) labeled substrates."

MetaFoR: Metabolic Flux Ratios
Fractional 13C labeling > Quantitative 2D NMR
Why use amino acids from proteins rather than metabolites directly?
Sauer et al. J Bacteriol 1999; 181: 6679-88
Szyperski et al. 1999, Metab. Eng. 1: 189
Dauner et al. 2001, Biotec Bioeng 76: 144

A functional genomics strategy that uses metabolome data to reveal the phenotype of silent mutations
-40 °C MeOH > 80 °C EtOH > Cobas Enzymatic BioAutoanalyser & quantitative 1H NMR, 0 to 4.4 ppm (1300 measures)
Raamsdonk et al. 2001, Nature Biotech 19: 45

Types of interaction models (increasing scope, decreasing resolution)

Quantum electrodynamics           subatomic
Quantum mechanics                 electron clouds
Molecular mechanics               spherical atoms (101 Pro1)
Master equations                  stochastic single molecules (Net1)
Phenomenological rates ODE        concentration & time (C, t)
Flux balance                      dCik/dt optima, steady state (Net1)
Thermodynamic models              dCik/dt = 0, k reversible reactions
Steady state                      ΣdCik/dt = 0 (sum over k reactions)
Metabolic Control Analysis        d(dCik/dt)/dCj (i = chem. species)
Spatially inhomogeneous models    dCi/dx

How do enzymes & substrates formally differ?
Diagram species: E, A, EA, EB, B; ATP, E, EATP, EP, ADP, E2+P
Catalysts increase the rate (& specificity) without being consumed.

Enzyme rate equations with one Substrate & one Product (E, S, P)
$\frac{dP}{dt} = \frac{V\,(S/K_s - P/K_p)}{1 + S/K_s + P/K_p}$
As P approaches 0: $\frac{dP}{dt} = \frac{V}{1 + K_s/S}$
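The limiting case on this slide can be checked numerically. The sketch below encodes the reversible one-substrate, one-product rate law and the Michaelis-Menten form it reduces to as P approaches 0; the function names and parameter values are arbitrary illustrations, not values from the lecture.

```python
def reversible_rate(s, p, v_max, ks, kp):
    """dP/dt = V (S/Ks - P/Kp) / (1 + S/Ks + P/Kp)."""
    return v_max * (s / ks - p / kp) / (1.0 + s / ks + p / kp)


def michaelis_menten(s, v_max, ks):
    """Limit of the reversible law as P -> 0: dP/dt = V / (1 + Ks/S)."""
    return v_max / (1.0 + ks / s)


# Arbitrary illustrative parameters.
V, KS, KP = 10.0, 0.5, 1.0
print(reversible_rate(2.0, 0.0, V, KS, KP))  # with no product present
print(michaelis_menten(2.0, V, KS))          # same value from the limiting form
print(reversible_rate(1.0, 2.0, V, KS, KP))  # S/Ks == P/Kp, so the net rate is zero
```

With S = 2, Ks = 0.5 and V = 10, both expressions give 8, and the net rate vanishes whenever S/Ks equals P/Kp, which is the equilibrium condition implied by the numerator.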

Enzyme Kinetic Expressions: Phosphofructokinase
Allosteric kinetic parameters for AMP, etc.

Human Red Blood Cell ODE model
Network diagram; species shown: GLCe, GLCi, G6P, F6P, FDP, DHAP, GA3P, 1,3DPG, 2,3DPG, 3PG, 2PG, PEP, PYR, LACi, LACe, GL6P, GO6P, RU5P, R5P, X5P, S7P, E4P, R1P, PRPP, ADO, ADOe, ADE, ADEe, INO, INOe, IMP, HYPX, AMP, ADP, ATP, NAD, NADH, NADP, NADPH, GSH, GSSG, K+, Na+, Cl-, HCO3-, pH
Jamshidi et al. 2000

Red Blood Cell in Mathematica: ODE model
Jamshidi et al. 2000

Protein 2: Properties & Quantitation
• Separation of proteins & peptides
• Protein localization & complexes
• Peptide identification (MS/MS)
  – Database searching & sequencing
• Protein quantitation
  – Absolute & relative
• Protein modifications & crosslinking
• Protein - metabolite quantitation