High Throughput and Large Scale Proteomics Analysis
Austin Yang, Ph.D.
Department of Pharmaceutical Sciences, University of Southern California

Overview
1. Shotgun proteomics and ESI mass spectrometry
2. Proteomic data mining
3. Data visualization

12,000 proteins

Are We Ready for Mammalian Proteomics?
[Figure labels: Shotgun Proteomics, 2-D Gel]
• Cytoskeletal proteins: mM (value not recovered), 1 × 10^9 copies/cell
• Metabolism: 0.1 mM, 1 × 10^8 copies/cell
• Ribosomes: 10 µM, 1 × 10^7 copies/cell
• Kinases: 1 µM, 1 × 10^6 copies/cell
• Cyclins: 0.1 µM, 1 × 10^5 copies/cell
• Transcription factors: 10 nM, 1 × 10^4 copies/cell
• Synaptic markers: 0.1 nM, 1 × 10^3 copies/cell
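
The copies-per-cell and concentration figures on this slide are linked through the cell volume. The sketch below shows that conversion. The cell volume of about 1.66 pL is an illustrative assumption, not a value from the slides, and the function name is made up for this example.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

# Assumed cell volume in litres (hypothetical, roughly a small mammalian cell).
ASSUMED_CELL_VOLUME_L = 1.66e-12


def copies_to_molar(copies_per_cell, cell_volume_litres=ASSUMED_CELL_VOLUME_L):
    """Convert a copy number per cell into mol/L for the given cell volume."""
    return copies_per_cell / (AVOGADRO * cell_volume_litres)


if __name__ == "__main__":
    for copies in (1e9, 1e6, 1e3):
        print(f"{copies:.0e} copies/cell -> {copies_to_molar(copies):.2e} M")
```

With this assumed volume, each tenfold drop in copy number corresponds to a tenfold drop in concentration, which is the pattern in the slide's table.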

Advantages of Proteomics Using LC-MS/MS
• No pre-selection of biased targets (hypothesis-free, open approach)
• Protein variants are detected simultaneously
• Protein isolation and detection are on a small scale (~10 fmol from complex mixtures: subcellular fractions, whole cells, or tissue)
• Sequence information is obtained for peptides (not just masses), and ~4,000 proteins can be sequenced in a single experiment

Liquid Chromatography Quadrupole Ion Trap Tandem Mass Spectrometer

Electrospray vs Nanospray

Splitless Nano-Liquid Chromatography

Five Independent Loop Injections

10-cycle MudPIT Analysis
[Figure: SCX (NH4OAc) salt steps of 100, 200, 300, 400, 500, 600, 700, 800, 900, and 1000 mM across columns RF#1 and RF#2, interleaved with wash steps and MS acquisitions.]

Multidimensional Protein Identification Technology (MudPIT)
[Figure labels: digested protein complexes; SCX column, 0-500 mM NH4OAc; RP #1; RP #2]
• 1,000-2,000 sequencing attempts in 60 minutes
• 20,000 MS/MS spectra/day

Isotope-Coded Affinity Tags (ICAT)

Electrospray Ionization (ESI)
[Figure labels: LC, spray tip, ions in solution, ions in gaseous phase, ion source opening for the MS]

Theoretical CID of a Tryptic Peptide
[Figure: CID of the tryptic peptide FLGK. Non-dissociated parent ions fragment into daughter ions b1, b2, b3 and y1, y2, y3. The MS/MS spectrum plots relative intensity against m/z, with the parent ion at m/z 464.29.]
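
The b- and y-ion series on this slide can be reproduced from residue masses. The sketch below uses standard monoisotopic residue masses for the four residues of FLGK and assumes singly charged fragments. The function names are illustrative and are not taken from any particular software package.

```python
# Monoisotopic residue masses (Da) for the residues in FLGK.
MONO = {"F": 147.06841, "L": 113.08406, "G": 57.02146, "K": 128.09496}
WATER = 18.01056   # H2O
PROTON = 1.00728   # mass of a proton


def precursor_mz(peptide, charge=1):
    """m/z of the intact (non-dissociated) peptide ion."""
    neutral = sum(MONO[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def fragment_ions(peptide):
    """Return singly charged b and y ion m/z lists (b1..b(n-1), y1..y(n-1))."""
    masses = [MONO[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(masses[:i]) + PROTON)           # N-terminal fragment
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)  # C-terminal fragment
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_ions("FLGK")
    print("parent m/z:", round(precursor_mz("FLGK"), 2))
    for i, (bi, yi) in enumerate(zip(b, y), start=1):
        print(f"b{i} = {bi:.2f}   y{i} = {yi:.2f}")
```

The computed singly protonated parent m/z agrees with the 464.29 shown on the slide.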

SequestQueue (6,000 dta × 50 = 300,000 MS/MS scans)

Data Mining through SEQUEST and PAULA
Database Search Time
• Yeast ORFs (6,351 entries): 52 sec (0.104 sec/s)
• Non-redundant protein (100K entries): 3,500 min
• EST (100K entries, 3 frames): 5-10,000 min

SEQUEST Algorithm
Step 1. Determine the parent ion molecular mass (ZSA charge assignment). The 500 peptides with masses closest to that of the parent ion are retrieved from a protein database.
Step 2. The computer generates a theoretical MS/MS spectrum for each peptide sequence (SEQ 1, 2, 3, 4, …).
Step 3. The experimental MS/MS spectrum is compared with each theoretical spectrum, and correlation scores are assigned (unified scoring function).
Step 4. Scores are ranked, and protein identifications are made based on these cross-correlation scores.
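
The four steps can be sketched in a few lines. This is a simplified illustration of the idea, not the SEQUEST implementation. It assumes unit-width m/z bins and a background correction averaged over offsets of ±75 bins, and all function names are hypothetical.

```python
def bin_spectrum(peaks):
    """Map (m/z, intensity) peaks onto unit-width m/z bins (sparse dict)."""
    bins = {}
    for mz, intensity in peaks:
        idx = int(round(mz))
        bins[idx] = max(bins.get(idx, 0.0), intensity)
    return bins


def shifted_dot(exp_bins, theo_bins, lag):
    """Dot product of the experimental spectrum with the theoretical one shifted by `lag` bins."""
    return sum(inten * theo_bins.get(idx + lag, 0.0) for idx, inten in exp_bins.items())


def xcorr_like(exp_bins, theo_bins, max_lag=75):
    """Zero-offset correlation minus the mean correlation at non-zero offsets."""
    zero_lag = shifted_dot(exp_bins, theo_bins, 0)
    offsets = [lag for lag in range(-max_lag, max_lag + 1) if lag != 0]
    background = sum(shifted_dot(exp_bins, theo_bins, lag) for lag in offsets) / len(offsets)
    return zero_lag - background


def select_candidates(parent_mass, peptide_masses, limit=500):
    """Step 1: names of up to `limit` peptides closest in mass to the parent ion."""
    return sorted(peptide_masses, key=lambda name: abs(peptide_masses[name] - parent_mass))[:limit]


def rank_candidates(exp_peaks, theoretical_spectra):
    """Steps 3-4: score each theoretical spectrum and rank, best first."""
    exp_bins = bin_spectrum(exp_peaks)
    scored = [(xcorr_like(exp_bins, bin_spectrum(peaks)), name)
              for name, peaks in theoretical_spectra.items()]
    return sorted(scored, reverse=True)
```

Here `theoretical_spectra` is a dict mapping a candidate peptide name to its predicted peak list (step 2), for example the b and y ions of each candidate.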

One spectrum, TWO protein identifications
• Spectrum A was searched against the NCBI human database: macrophage inhibitory factor was identified (Mol Cell Proteomics. 2003 Jul; 2(7): 428-42).
• The same spectrum was searched against the non-redundant database: bovine G-protein gamma was identified. Because the primary amino acid sequence of human G-protein gamma is almost identical to the bovine sequence, this protein was later identified as human G-protein gamma.
• The initial false ID was due to a missing entry for human G-protein gamma in the human database. The sequence was later re-entered into the human database, and a third search yielded the correct ID.
• Fragment ions that match both sequences are indicated by *.
• Spectrum B has two additional ions matched to G-protein gamma.

Distribution of Xcorr from correctly and incorrectly identified peptides

X-correlation vs Peptide length

Distribution of Xcorr vs Charge State

F-score and probability-based peptide assignment

Identification of modified LRP in APP/PS1 Transgenic Mice

Neurotransmitter Receptors
Tg Peptide
A)
1. (Q9WV18) Gamma-aminobutyric acid type B receptor, subunit 1 precursor (GABA-B-R1)
2. (NP_032102.1) gamma-aminobutyric acid (GABA-A) receptor, subunit rho 2
3. (NP_034382.1) gamma-aminobutyric acid A receptor, gamma 1
4. (NP_033733.1) cholinergic receptor, nicotinic, epsilon polypeptide; acetylcholine receptor
5. (NP_150372.1) cholinergic receptor, muscarinic 3, cardiac; AChR M3
6. (S28058) serotonin receptor 5
7. (NP_031903.1) dopamine receptor 3; D3 receptor
8. (Q60934) Glutamate receptor, ionotropic kainate 1 precursor (Glutamate receptor 5)
9. (I49696) glutamate receptor chain B (version flip)
B)
1. (NP_038589.1) 5-hydroxytryptamine (serotonin) receptor 3A
2. (P30545) Alpha-2B adrenergic receptor (Alpha-2B adrenoceptor)
3. (NP_032195.1) glutamate receptor, ionotropic, NMDA1 (zeta 1)
4. (NP_032198.1) glutamate receptor, ionotropic, NMDA2D (epsilon 4); GluRepsilon4
5. (I49696) glutamate receptor chain B (version flip)
C)
1. (NP_034428.1) glycine receptor, beta subunit
2. (JC4262) glutamate transporter 2

Proteomic Data Visualization and Future Directions
• Information overload
• Data integration
• Ease of visualization

Network for NMDA and glutamate receptors

Network for NMDA and glutamate receptors (Zoom-in)

Scoring Algorithm for Spectral Analysis
[Figure labels: SEQUEST; Raw Unidentified Spectra (~10,000-100,000); SALSA; Identified Sequence]

SALSA Overview
[Figure labels: product ion, charged loss, neutral loss, mass difference, ion series (A G D W T)]
• SALSA is a tool for identifying MS-MS spectra in Xcalibur analysis files that display specific user-defined characteristics. Because these characteristics correspond to structural features of a peptide, SALSA allows the user to selectively locate MS-MS spectra of specific peptides or their variant or modified forms.
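
To illustrate what screening spectra for user-defined characteristics can look like, the sketch below keeps only spectra that contain every requested product ion and every requested neutral loss from the precursor. It is a toy filter under assumed data structures (a dict per spectrum with `id`, `precursor_mz`, `charge`, and `peaks`). It is not the SALSA scoring scheme, and the tolerance and intensity threshold are arbitrary illustrative values.

```python
def has_peak(peaks, target_mz, tol=0.5, min_rel_intensity=0.05):
    """True if a peak lies within `tol` of target_mz and above the relative intensity cutoff."""
    if not peaks:
        return False
    base = max(intensity for _, intensity in peaks)
    return any(abs(mz - target_mz) <= tol and intensity >= min_rel_intensity * base
               for mz, intensity in peaks)


def screen_spectra(spectra, product_ions=(), neutral_losses=(), tol=0.5):
    """Return ids of spectra showing all requested product ions and neutral losses."""
    hits = []
    for spec in spectra:
        # A neutral loss of mass L from a precursor of charge z appears at precursor_mz - L / z.
        targets = list(product_ions)
        targets += [spec["precursor_mz"] - loss / spec["charge"] for loss in neutral_losses]
        if targets and all(has_peak(spec["peaks"], t, tol) for t in targets):
            hits.append(spec["id"])
    return hits
```

For example, `screen_spectra(spectra, neutral_losses=[64.0])` would flag spectra with a loss of about 64 Da from the precursor, a loss commonly associated with oxidized methionine.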

Construction of SALSA ruler
[Figure: fragment ladders of GAIIGLMGGVV (GA, GAI, GAII, GAIIG, GAIIGL, GAIIGLM, GAIIGLMGG, GAIIGLMGGV, GAIIGLMGGVV) laid out along the m/z axis.]
• Methionine oxidation: +16 amu (one oxygen atom)
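
The effect of methionine oxidation on such a ruler can be sketched by computing the N-terminal fragment ladder with and without the extra oxygen. The code assumes singly charged b-type fragments and standard monoisotopic masses. The function name and the `mods` argument are hypothetical.

```python
# Monoisotopic residue masses (Da) for the residues in GAIIGLMGGVV.
MONO = {"G": 57.02146, "A": 71.03711, "I": 113.08406,
        "L": 113.08406, "M": 131.04049, "V": 99.06841}
PROTON = 1.00728
OXYGEN = 15.99491  # one oxygen atom, the "+16 amu" of methionine oxidation


def b_ladder(peptide, mods=None):
    """Return [(fragment, m/z)] for b1..b(n-1); `mods` maps 1-based position to added mass."""
    mods = mods or {}
    ladder = []
    running = PROTON
    for pos, aa in enumerate(peptide[:-1], start=1):
        running += MONO[aa] + mods.get(pos, 0.0)
        ladder.append((peptide[:pos], running))
    return ladder


if __name__ == "__main__":
    peptide = "GAIIGLMGGVV"
    met_pos = peptide.index("M") + 1  # 1-based position of methionine
    plain = b_ladder(peptide)
    oxidized = b_ladder(peptide, mods={met_pos: OXYGEN})
    for (frag, mz0), (_, mz1) in zip(plain, oxidized):
        print(f"{frag:<11} {mz0:9.3f} {mz1:9.3f}  shift {mz1 - mz0:6.3f}")
```

Fragments that do not include the methionine are unshifted, while every fragment from GAIIGLM onward moves by about 16 amu. That step in the ladder is what localizes the modification.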

Absolute Quantification Analysis
Quantification of Methionine Oxidation
GAIIGLMVGGVV: +7 amu
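
A minimal sketch of the arithmetic follows, assuming that "+7 amu" denotes a stable-isotope-labeled internal standard of the same peptide spiked in at a known amount. The slide does not spell this out, so treat it as an assumption. All function names are illustrative, and any numbers passed in are hypothetical inputs.

```python
def heavy_mz(light_mz, charge, label_shift=7.0):
    """m/z of the labeled standard, given the unlabeled m/z and charge state."""
    return light_mz + label_shift / charge


def absolute_amount(light_area, heavy_area, spiked_amount_fmol):
    """Amount of endogenous peptide from the light/heavy peak-area ratio."""
    if heavy_area <= 0:
        raise ValueError("heavy (standard) peak area must be positive")
    return light_area / heavy_area * spiked_amount_fmol


def fraction_oxidized(oxidized_amount, unoxidized_amount):
    """Share of the peptide carrying the oxidized methionine."""
    total = oxidized_amount + unoxidized_amount
    if total <= 0:
        raise ValueError("total amount must be positive")
    return oxidized_amount / total
```

The approach relies on the labeled and unlabeled peptides co-eluting and ionizing alike, so that the peak-area ratio tracks the molar ratio.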