Proteomics Jen, Mona & Krishna

Introduction
What is the proteome?
• The proteome is the entire complement of proteins, including the modifications made to a particular set of proteins, produced by an organism or system at a particular time and under particular conditions.
• It varies with time and with the distinct requirements, or stresses, that a cell or organism undergoes.

What is proteomics?
• Proteomics is the large-scale study of proteins, particularly their functions and structures.
• A short list of protein modifications that might be studied under proteomics includes:
1. phosphorylation
2. ubiquitination
3. methylation
4. acetylation
5. glycosylation
6. oxidation
7. nitrosylation, etc.

Why proteomics?
• Gives a better understanding of an organism than genomics.
• Limitations of genomics that make proteomics a better approach:
1. The level of transcription of a gene gives only a rough estimate of its level of expression into a protein.
2. Many transcripts give rise to more than one protein, through alternative splicing or alternative post-translational modifications.
3. Many proteins form complexes with other proteins or RNA molecules, and only function in the presence of these other molecules.

4. Proteins experience post-translational modifications that profoundly affect their activities.
5. Protein degradation rate plays an important role in protein content.
• Any cell may make different sets of proteins at different times or under different conditions. Furthermore, any one protein can undergo a wide range of post-translational modifications. Proteomics is therefore a more informative approach than genomics, but also a more complex one.

Branches of proteomics
• Proteomics analysis
◦ Determining which proteins are post-translationally modified
• Expression proteomics
◦ Profiling of expressed proteins using quantitative methods
• Cell mapping proteomics
◦ Identification of protein complexes

Methods
1. Gel-based proteomics (2-DE):
◦ Older approach.
◦ Separates proteins according to charge in the first dimension and according to size in the second dimension.
◦ Proteins are commonly separated using polyacrylamide gel electrophoresis (PAGE).
◦ Identifies individual proteins in complex samples or multiple proteins in a single sample.

2. Mass spectrometry based proteomics:
◦ Highly accurate for extremely low-mass particles.
◦ Proteins are cleaved into peptides with a protease (e.g. trypsin), and the peptide masses are detected with a mass spectrometer (e.g. TOF).
◦ The mass spectrum of the peptides is obtained and converted to a list of peptide masses, which is searched against the genome databases.
◦ Since each protein has a characteristic peptide mass fingerprint, the peptide masses can identify the protein in the database.
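The peptide mass fingerprinting steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular search engine: it assumes trypsin as the protease (cleavage after K or R, except before P), uses standard monoisotopic residue masses, and scores a database entry simply by counting observed masses that fall within a tolerance. The function names, the toy sequences and the 0.01 Da tolerance are all hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (assumed trypsin rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fingerprint(sequence):
    """Theoretical peptide mass list for one protein sequence."""
    return sorted(peptide_mass(p) for p in tryptic_digest(sequence))


def count_matches(observed, theoretical, tol=0.01):
    """Number of observed masses within tol Da of any theoretical mass."""
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def identify(observed, database, tol=0.01):
    """Return the best-scoring entry name and all scores."""
    scores = {name: count_matches(observed, fingerprint(seq), tol)
              for name, seq in database.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    toy_db = {"protA": "MKTAYIAKQR", "protB": "GASPVTCLINK"}  # made-up sequences
    observed = fingerprint(toy_db["protA"])  # stands in for a measured mass list
    print(identify(observed, toy_db))
```

Real search tools also account for missed cleavages, modifications and measurement error, so a simple match count like this is only a starting point.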

3. Protein arrays
◦ The idea is similar to cDNA arrays.
◦ A substrate is bound to the surface of the array.
◦ The sample is introduced and binding takes place.
◦ Detection and analysis follow.
◦ Protein-protein, protein-DNA or protein-RNA interactions can be analysed.

Applications
• Identification of potential new drugs for the treatment of diseases
◦ This relies on genome and proteome information to identify proteins associated with a disease, which computer software can then use as targets for new drugs.
• Biomarkers
◦ A number of techniques make it possible to test for proteins produced during a particular disease, which helps to diagnose the disease quickly.

Examples of biomarkers
• Alzheimer's disease
◦ In Alzheimer's disease, elevations in beta-secretase create amyloid/beta-protein. Targeting this enzyme decreases the amyloid/beta-protein and slows the progression of the disease.
• Heart disease
◦ Standard protein biomarkers for cardiovascular disease (CVD) include interleukin-6, interleukin-8, serum amyloid A protein, fibrinogen, and troponins.

BIOINFORMATICS & DATABASE TOOLS

Introduction – Current State
• Many different informational protein databases are available online
• Most databases are focused on protein identification
◦ The research community provides the data that drives the database contents
◦ Validation of mass spec data
• Single- vs. multiple-species support

Overview of Databases
• NCBI – Protein / Peptidome
• Human Gene and Protein Database (HGPD)
• Human Proteinpedia / Human Protein Reference Database (HPRD)
• Dynamic Proteomics
• Open Proteomics Database
• Global Proteome Machine Database
• Peptide Atlas
• Proteomics Identifications Database (PRIDE)
• UniProt Knowledgebase

NCBI – Protein / Peptidome
• Two databases contained in the Entrez suite
• Multi-species result sets
• Protein
◦ Provides gene information pertaining to the queried expressed protein
• Peptidome
◦ Mass spec based protein identification database
◦ Experiment-based result sets

Human Gene and Protein Database (HGPD)
• Several cDNA contributors, spanning the globe
• Gateway Expression System
◦ Allows for a reproducible clone library. Clones are available for purchase.
• Wheat germ cell-free protein synthesis
◦ Protein expression portion of the database. Allows for visualization of the SDS-PAGE results.

Human Proteinpedia / Human Protein Reference Database (HPRD)
• Modeled after Wikipedia
◦ Users submit and edit the data in the database
◦ Differences from Wikipedia: the original submitter is expected to provide experimental evidence for the data, and only the original submitter can edit that specific data later.
• Allows several protein features to be annotated
◦ Post-translational modification
◦ Tissue expression
◦ Cell line expression
◦ Subcellular localization
◦ Enzyme substrates
◦ Protein-protein interactions

Human Proteinpedia / Human Protein Reference Database (HPRD)
• No visual protein expression data
• Protein amino acid sequence given
• Raw and processed mass spec files are available as experimental evidence
• Provides links to the protein in other databases

Dynamic Proteomics
• A different type of database, focusing on the dynamics of proteins treated with an anti-cancer drug
• Shows different uses for data repositories in proteomics
◦ Not just an all-encompassing data source with generic data
◦ Uses simple databases and web front ends to make more specific types of data available to the community
• Also provides links to other databases
• Can compare multiple sequences at once to search the cDNA library

Dynamic Proteomics
• Time-lapse microscopy movies illustrate the protein dynamics in individual living human cancer cells in response to an anti-cancer drug.
[Time-lapse video]

Open Proteomics Database
• University of Texas
• Multi-species results
• Smaller pool of submitted data available for query

Global Proteome Machine Database
• Private industry involvement
• Mass spec validation
• Protein identification
• Utilizes data from other databases
◦ Differs from the scheme of just linking to other protein databases

Peptide Atlas
• Seattle Proteome Center
• Focused on a subset of human proteins
◦ Heart, lung, blood
• Funded by NIH
• Part of the Trans-Proteomic Pipeline software suite

Proteomics Identifications Database (PRIDE)
• One of the earlier proteomic databases
• European Bioinformatics Institute
• Larger selection of species-specific data
• Java-based, available for local deployment

UniProt Knowledgebase
• Swiss Institute of Bioinformatics
• Also curated by the European Bioinformatics Institute
• Funded by NIH
◦ This forced the conversion of earlier non-public versions to become free and open

Overview of Tools
• ExPASy Proteomics Server
• Trans-Proteomic Pipeline

ExPASy Proteomics Server
• Swiss Institute of Bioinformatics tool suite
• Protein identification by amino acid sequence
• Isoelectric point computation
• Prediction of post-translational modifications and amino acid substitutions
• Prediction of protein cleavage sites
• Protein identification by molecular weight
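To make the isoelectric point computation listed above concrete, here is a rough Python sketch. It is not ExPASy's implementation: it assumes the common textbook approach of summing Henderson-Hasselbalch charges for the termini and ionizable side chains and then bisecting for the pH at which the net charge is zero. The pKa values are one illustrative set (published sets differ, and the ExPASy tool may use others), and the function names are hypothetical.

```python
# Illustrative pKa values (assumption: one of several published sets).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}             # protonated form is +1
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}    # deprotonated form is -1
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(sequence, ph):
    """Approximate net charge of an unmodified peptide at a given pH."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for aa in sequence:
        if aa in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return positive - negative


def isoelectric_point(sequence, precision=1e-4):
    """Bisect on pH in [0, 14]; net charge falls monotonically as pH rises."""
    low, high = 0.0, 14.0
    while high - low > precision:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


if __name__ == "__main__":
    for seq in ("G", "KKKK", "DDDD"):  # made-up test peptides
        print(seq, round(isoelectric_point(seq), 2))
```

The same charge model underlies the first dimension of 2-DE, where proteins stop migrating at the pH equal to their isoelectric point.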

Trans-Proteomic Pipeline
• Seattle Proteome Center

Challenges
• Large number of data sources
• Parallel efforts
• Validation of mass spec data

Future Considerations
• Selection of a few 'primary' data repositories
• Consolidation of multiple redundant efforts being funded by the same agency
◦ Particularly NIH
• Data standards to streamline the submission of results into multiple data sources
◦ Reduction of the need to perform many searches to find information about a protein
◦ mzXML is a start, but only covers mass spec data

Database References
• NCBI
◦ Protein: http://www.ncbi.nlm.nih.gov/protein/
◦ Peptidome: http://www.ncbi.nlm.nih.gov/pepdome
• Human Gene and Protein Database (HGPD)
◦ http://riodb.ibase.aist.go.jp/hgpd/cgi-bin/index.cgi
• Human Proteinpedia
◦ http://www.humanproteinpedia.org/index_html
• Human Protein Reference Database (HPRD)
◦ http://www.hprd.org/
• Dynamic Proteomics
◦ http://alon-serv.weizmann.ac.il/dynamprotb/seqsrch
• Open Proteomics Database
◦ http://bioinformatics.icmb.utexas.edu/OPD/
• Global Proteome Machine Database
◦ http://thegpm.org
• Peptide Atlas
◦ http://www.peptideatlas.org/
• Proteomics Identifications Database (PRIDE)
◦ http://www.ebi.ac.uk/pride/
• UniProt Knowledgebase
◦ http://www.uniprot.org/

Tool References
• ExPASy Proteomics Server
◦ http://www.expasy.ch/
• Trans-Proteomic Pipeline
◦ http://tools.proteomecenter.org/wiki/index.php?title=Software:TPP

Applications of Proteomics Mona Motwani

Discovery of protein biomarkers
• A biomarker can be defined as any laboratory measurement or physical sign used as a substitute for a clinically meaningful end point that directly measures how a patient feels, functions or survives. As applied to proteomics, a biomarker is an identified protein (or set of proteins) that is unique to a particular disease state.
• Biomarkers of drug efficacy and toxicity are becoming a key need in the drug development process.
• Mass spectrometry based proteomic technologies are ideally suited to the discovery of protein biomarkers in the absence of any prior knowledge of quantitative changes in protein levels.
• The success of any biomarker discovery effort depends on the quality of the samples analysed, the ability to generate quantitative information on relative protein levels, and the ability to readily interpret the data generated.

Study of Tumor Metastasis and Cancers
• Identifying proteins whose expression correlates with the metastatic process helps to understand metastatic mechanisms and thus facilitates the development of strategies for therapeutic intervention and clinical management of cancer.
• Information contained within proteomic patterns has been demonstrated to detect ovarian, breast and prostate cancers with sensitivities and specificities greater than 90%.
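For readers less familiar with the two figures quoted above, sensitivity is the fraction of true disease cases a test flags, and specificity is the fraction of disease-free cases it correctly clears. The short Python sketch below computes both from paired labels; the function name and the example labels are made up for illustration and do not come from any study.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) from paired 0/1 labels.

    actual[i] is 1 if sample i truly has the disease;
    predicted[i] is 1 if the proteomic classifier calls it diseased.
    """
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    tn = sum(1 for a, p in zip(actual, predicted) if not a and not p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy sample")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]      # hypothetical ground truth
    predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]   # hypothetical classifier calls
    print(sensitivity_specificity(actual, predicted))
```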

Field of Neurotrauma
• Neurotrauma results in complex alterations to the biological systems within the nervous system, and these changes evolve over time.
• Near-completion of the Human Genome Project has stimulated scientists to begin looking for the next step in unraveling normal and abnormal functions within biological systems. Consequently, there is a new focus on the role of proteins in these processes.
• Proteomics is a burgeoning field that may provide a valuable approach to evaluating the post-traumatic central nervous system (CNS). However, the sensitivity of the tissue and the detection of potential biomarkers are major concerns.

Renal disease diagnosis
• Proteomics has also found significant application in studying the effects of chemical insults on the kidney, particularly those caused by environmental toxins, drugs and other bioactive agents.
• Combining classic analytical techniques such as two-dimensional gel electrophoresis with more sophisticated techniques such as MS and liquid chromatography has enabled considerable progress in cataloguing and quantifying the proteins present in urine and various kidney tissue compartments in both normal and diseased physiological states.
• Critical developmental tasks still to be accomplished are completely defining the proteome in the various biological compartments (e.g. tissues, serum and urine) in both health and disease, which is a major challenge given the dynamic range and complexity of such proteomes, and achieving the routine ability to accurately and reproducibly quantify proteomic expression profiles and develop diagnostic platforms.

Neurology
• In neurology and neuroscience, many applications of proteomics have involved neurotoxicology and neurometabolism, as well as the determination of specific proteomic aspects of individual brain areas and body fluids in neurodegeneration.
• Investigation of brain protein groups in neurodegeneration, such as enzymes, cytoskeleton proteins, chaperones, synaptosomal proteins and antioxidant proteins, is in progress as phenotype-related proteomics.
• The concomitant detection of several hundred proteins on a gel provides sufficiently comprehensive data to determine a pathophysiological protein network and its peripheral representatives. An additional advantage is that hitherto unknown proteins have been identified as brain proteins.

Autoantibody profiling
• Proteomics technologies enable profiling of autoantibody responses using biological fluids derived from patients with autoimmune disease.
• They provide a powerful tool to characterize autoreactive B-cell responses in diseases including rheumatoid arthritis, multiple sclerosis, autoimmune diabetes, and systemic lupus erythematosus.
• Autoantibody profiling may serve purposes including classification of individual patients and subsets of patients based on their 'autoantibody fingerprint', examination of epitope spreading and antibody isotype usage, discovery and characterization of candidate autoantigens, and tailoring of antigen-specific therapy.

Alzheimer's disease
• In Alzheimer's disease, elevations in beta-secretase create amyloid/beta-protein, which causes plaque to build up in the patient's brain. This plaque is thought to play a role in dementia.
• Targeting this enzyme decreases the amyloid/beta-protein and so slows the progression of the disease.
• One procedure to test for an increase in amyloid/beta-protein is immunohistochemical staining, in which antibodies bind to specific amyloid/beta-protein antigens in biological tissue.

Heart disease
• Heart disease is commonly assessed using several key protein-based biomarkers. Standard protein biomarkers for cardiovascular disease (CVD) include interleukin-6, interleukin-8, serum amyloid A protein, fibrinogen, and troponins.
• Cardiac troponin I (cTnI) increases in concentration within 3 to 12 hours of initial cardiac injury and can remain elevated for days after an acute myocardial infarction (MI).
• A number of commercial antibody-based assays, as well as other methods, are used in hospitals as primary tests for acute MI.

Future Challenges
• There is a need for biomarkers with more accurate diagnostic capability, particularly for early-stage disease.
• There is also a need to add a quality control sample on each chip array and to normalize spectral data through commercially available or in-house computer programs.
• Another challenge facing proteomics techniques lies largely in the application of bioinformatics, i.e. spectral data management and analysis. The vast amount of spectral data generated demands the implementation of advanced data management and analysis strategies.
• Finally, the obvious challenge, as stated by many investigators, is the identification of the important proteins and peptides that contribute to the proteomic analysis.
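As a small illustration of the quality control and normalization point above, the following Python sketch shows one simple scheme. It is an assumption for illustration, not the procedure used by any specific commercial or in-house program: each spectrum is scaled by its total ion current (the sum of its intensities), and the coefficient of variation of a peak across repeated QC-sample runs is used as a rough reproducibility check. The function names and the numbers are hypothetical.

```python
import statistics


def tic_normalize(intensities):
    """Scale a spectrum so that its intensities sum to 1 (total ion current)."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum has no signal")
    return [value / total for value in intensities]


def qc_coefficient_of_variation(values):
    """Population standard deviation divided by the mean, for one QC peak."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("mean intensity is zero")
    return statistics.pstdev(values) / mean


if __name__ == "__main__":
    spectrum = [1.0, 1.0, 2.0]                   # made-up peak intensities
    print(tic_normalize(spectrum))
    qc_peak_across_chips = [0.24, 0.26, 0.25]    # made-up normalized QC values
    print(qc_coefficient_of_variation(qc_peak_across_chips))
```

A low coefficient of variation for the QC sample across chips suggests that differences seen between patient samples are less likely to be chip-to-chip artifacts.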