Recent Trends in Computational Proteomics Dr G P
- Slides: 51
Recent Trends in Computational Proteomics
Dr G P S Raghava, Institute of Microbial Technology, Chandigarh, India
Bioinformatics | Drug Informatics | Chemoinformatics | Vaccine Informatics
Email: raghava@imtech.res.in
http://crdd.osdd.net/
http://www.imtech.res.in/raghava/
Studying the Central Dogma with “Omics”
• Genomics: DNA
• Transcriptomics: mature mRNA
• Proteomics: protein
• Metabolomics: all the endogenous metabolites produced
Complexity of the system
NEXT GENERATION SEQUENCING
• Sequence the full genome of an organism in a few days at a very low cost.
• Produce high-throughput data in the form of short reads.
Platforms: Illumina, ABI’s SOLiD, Roche’s 454 FLX, Ion Torrent
Genome assembly and annotation done at IMTECH
• Burkholderia sp. SJ98 (Kumar et al. 2012).
• Debaryomyces hansenii MTCC 234 (Kumar et al. 2012).
• Imtechella halotolerans K1T (Kumar et al. 2012).
• Marinilabilia salmonicolor JCM 21150T (Kumar et al. 2012).
• Rhodococcus imtechensis sp. RKJ300 (Vikram et al. 2012).
• Rhodosporidium toruloides MTCC 457 (Kumar et al. 2012).
RNA sequencing
RNA-Seq is a recently developed approach to transcriptome profiling that uses deep-sequencing technologies to measure levels of transcripts and their isoforms.
Workflow:
• Samples of interest, by condition (e.g. colon tumor)
• Isolate RNAs
• Generate cDNA, fragment, size select, add linkers
• Sequence ends
• Map to genome, transcriptome, and predicted exon junctions
• Downstream analysis
Scale: 100s of millions of paired reads; 10s of billions of bases of sequence
Methods for Protein Analysis
• Gel electrophoresis, northern/western blot (fluorescence/radioactive label)
• X-ray crystallography
• Protein microarrays
• Mass spectrometry
Protein arrays
• High-throughput analysis of hundreds of thousands of proteins.
• Proteins are immobilized on a glass chip.
• Various probes (proteins, lipids, DNA, peptides, etc.) are used.
Cons
• Require a priori knowledge of the proteins of interest.
• Depend on the availability of suitable antibodies.
• Measure only a small fraction of the proteome.
Mass Spectrometry
• Ionization: find a way to “charge” an atom or molecule.
• Mass analyzer: place the charged atom or molecule in a magnetic or electric field and measure its speed or radius of curvature, which depends on its mass-to-charge ratio.
• Detection: detect ions using a microchannel plate or photomultiplier tube.
Flow: Sample → Ion source (makes ions) → Mass analyzer (separates ions) → Mass spectrum (presents information)
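Protein Identification by MS
• Experimental side: spot removed from gel → fragmented using trypsin → spectrum of fragments generated → experimental spectra
• Theoretical side: database of sequences (e.g. SwissProt) → artificially trypsinated → theoretical spectra built → library
• MATCH: experimental spectra are compared against the library of theoretical spectra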
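The "artificially trypsinated" step can be sketched in a few lines. This is a minimal illustration, not the method of any particular search engine: it assumes the common rule that trypsin cleaves after K or R unless the next residue is P, allows no missed cleavages, and uses standard monoisotopic residue masses. The function names `tryptic_digest` and `peptide_mass` are hypothetical.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added for the free N- and C-termini


def tryptic_digest(sequence):
    """Split after K or R, except when the next residue is P.

    Assumption: no missed cleavages are generated.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if p]  # drop the empty piece after a final K/R


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


if __name__ == "__main__":
    for pep in tryptic_digest("AKRPGKMSR"):
        print(pep, round(peptide_mass(pep), 4))
```

A real theoretical library would also include missed cleavages and modifications; those are left out here to keep the sketch short.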
Instrumentation
• High vacuum system
• Inlet: HPLC, flow injection, sample plate
• Ion source: MALDI, ESI
• Mass analyzer: time of flight (TOF), quadrupole, ion trap, magnetic sector, FTMS
• Detector: microchannel plate, electron multiplier, hybrid with photomultiplier
• Data system
Ion Sources
Ion sources make ions from sample molecules (ions are easier to detect than neutral molecules).
• MALDI (diagram labels): sample plate, laser, grid (0 V), +/- 20 kV
• Electrospray ionization (diagram labels): sample inlet nozzle (lower voltage), pressure = 1 atm, inner tube diam. = 100 um, sample in solution, N2 gas, high voltage applied to metal sheath (~4 kV), charged droplets, partial vacuum
• Ions shown: MH+, MH2+, MH3+
Mass analyzers
• Separate ions based on their mass-to-charge ratio (m/z); what is actually measured is m/z, not mass.
• Operate under high vacuum (keeps ions from bumping into gas molecules).
• Key specifications are resolution, mass measurement accuracy, and sensitivity.
• Several kinds exist; for bioanalysis, quadrupole, time-of-flight, and ion trap analyzers are the most used.
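Two of the specifications named above are usually reported as simple ratios. The sketch below assumes the common conventions that mass accuracy is given in parts per million (ppm) relative to the theoretical m/z, and that resolving power is m/z divided by the peak width at half maximum. The function names are hypothetical and are not taken from the slides.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Mass measurement error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def resolving_power(mz, peak_width_fwhm):
    """Resolving power as m/z divided by the peak width (FWHM)."""
    return mz / peak_width_fwhm


def within_tolerance(observed_mz, theoretical_mz, tolerance_ppm):
    """True if the observed m/z lies within the given ppm window."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm


if __name__ == "__main__":
    print(ppm_error(500.005, 500.0))
    print(resolving_power(500.0, 0.05))
    print(within_tolerance(500.005, 500.0, 20.0))
```

Database search tools typically apply such a ppm window when selecting candidate peptides for a precursor ion.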
Tandem Mass Spectrometry (MS/MS)
Flow: LC → Ion source → MS-1 (parent ions) → collision cell → MS-2 (fragment ions)
What’s in a Mass Spectrum?
• Y axis: ion abundance (as a % of the base peak).
• X axis: mass, as m/z, where z is the charge. For doubly charged ions (often seen in macromolecules), masses show up at half their proper value.
• Fragment ions: derived from the molecular ion or from higher-weight fragments.
• “Molecular ion” region (high mass): molecular ions and adduct ions, [M+reagent gas]+.
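The "half their proper value" remark follows directly from the definition of m/z. A small sketch, assuming positive-mode ions formed by adding z protons to a neutral molecule of mass M (so the value is only approximately half, because each proton adds about 1.007 Da). The function names are hypothetical.

```python
PROTON = 1.007276  # mass of a proton in Da


def mz(neutral_mass, charge):
    """m/z of an ion formed by adding `charge` protons to a neutral molecule."""
    return (neutral_mass + charge * PROTON) / charge


def neutral_mass_from_mz(observed_mz, charge):
    """Invert mz(): recover the neutral mass from an observed m/z and charge."""
    return observed_mz * charge - charge * PROTON


if __name__ == "__main__":
    for z in (1, 2, 3):
        print(z, round(mz(1000.0, z), 4))
```

For a 1000 Da molecule the singly charged ion appears near m/z 1001 and the doubly charged ion near m/z 501.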
Peptide Fragmentation
Peptide (N-terminus on the left, C-terminus on the right):
H...-HN-CH(Ri-1)-CO-NH-CH(Ri)-CO-NH-CH(Ri+1)-CO-...OH
After collision induced dissociation (with H+):
• Prefix fragment: H...-HN-CH(Ri-1)-CO
• Suffix fragment: NH-CH(Ri)-CO-NH-CH(Ri+1)-CO-...OH
Notes:
• Peptides tend to fragment along the backbone.
• Fragments can also lose neutral chemical groups such as NH3 and H2O.
B ions and Y ions
http://www.molgen.mpg.de/101151/Proteomics
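Prefix fragments correspond to b ions and suffix fragments to y ions. The sketch below computes singly charged b and y ion m/z values under the usual conventions (b ion = sum of prefix residue masses plus one proton; y ion = sum of suffix residue masses plus water plus one proton). It ignores neutral losses and modifications, and the function name `by_ions` is hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.007276


def by_ions(peptide):
    """Return (b_ions, y_ions) as lists of (label, m/z) for charge 1+."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        # b_i: first i residues (prefix fragment) plus one proton
        b_ions.append((f"b{i}", sum(masses[:i]) + PROTON))
        # y_i: last i residues (suffix fragment) plus water plus one proton
        y_ions.append((f"y{i}", sum(masses[-i:]) + WATER + PROTON))
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = by_ions("GAK")
    for label, value in b + y:
        print(label, round(value, 4))
```

A theoretical spectrum for database matching is essentially this list of fragment m/z values, usually extended with higher charge states and neutral losses.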
Mass spectra searching techniques
Commercial software
• SEQUEST (Yates et al., 1995)
• MASCOT (Perkins, Pappin, Creasy, & Cottrell, 1999)
Open database search tools
• MyriMatch
• X!Tandem
• MSGF
• OMSSA
Reported as more accurate than Mascot and SEQUEST (Kim & Pevzner, 2014)
Target-Decoy Search Strategy for Mass Spectrometry-Based Proteomics
• Incorrect “decoy” sequences are added to the search space. Matches to decoys are known to be wrong, so they reveal how many incorrect search results might otherwise be deemed correct.
• Each mass spectrum is searched against a combined target and reversed decoy database.
• The proportion of matches in the decoy database represents the false matches.
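The strategy can be sketched in two steps: build reversed decoys, then estimate the false discovery rate (FDR) from the decoy hits above a score threshold. This is a minimal sketch under stated assumptions: the FDR is estimated as decoy hits divided by target hits (other conventions exist), and the names `reversed_decoys`, `estimate_fdr`, and the `DECOY_` prefix are hypothetical.

```python
def reversed_decoys(targets, prefix="DECOY_"):
    """Build a decoy entry for every target protein by reversing its sequence."""
    return {prefix + name: seq[::-1] for name, seq in targets.items()}


def estimate_fdr(psms, threshold):
    """Estimate FDR among matches scoring at or above `threshold`.

    psms: list of (score, is_decoy) pairs, one per peptide-spectrum match.
    Assumption: FDR is approximated as decoy hits / target hits.
    """
    target_hits = 0
    decoy_hits = 0
    for score, is_decoy in psms:
        if score >= threshold:
            if is_decoy:
                decoy_hits += 1
            else:
                target_hits += 1
    return decoy_hits / target_hits if target_hits else 0.0


if __name__ == "__main__":
    database = {"P1": "MKWVTFISLLR"}
    database.update(reversed_decoys(database))
    print(database)
    matches = [(50, False), (45, False), (40, True), (30, False), (20, True)]
    print(estimate_fdr(matches, 30))
```

In practice the score threshold is lowered until the estimated FDR reaches a chosen level, often 1%.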
Applications
• Analyzing protein modifications
• Finding all modifications on a single protein
• Proteome-wide scanning of modifications
• Protein profiling
• Generate large-scale proteome maps
• Annotate and correct genomic sequences
• Analyze protein expression as a function of cellular state
• Detection of amino acid substitutions
• Protein sample identification/confirmation
• Protein sample purity determination
Major Proteomics Repositories
Major Challenge
• Large number of unidentified spectra.
• The peptides may be missing from the database searched.
• Are all the reference databases complete?
Proteogenomics
• Term coined in the literature in 2004.
• Genomic data are used to generate customized databases.
• Identify novel peptides.
• Disease biomarkers based on novel mutations.
[Figure: Proteomics vs. proteogenomics workflows. Spectra are plotted as intensity vs. mass/charge.]
• Proteomics: tumor and non-tumor sample spectra are searched against RefSeq and UniProt.
• Proteogenomics: tumor and non-tumor sample spectra are searched against a tumor-specific protein DB built from genome sequencing (germline variants) and RNA-Seq (alternative splicing, somatic variants, expression).
• Outputs: aberrant proteins, variant proteins, fusion proteins.
TYPES OF PEPTIDES IDENTIFIED IN PROTEOGENOMICS
• Novel peptides mapping to the non-coding 5’UTR region
• Novel peptides mapping to the non-coding 3’UTR region
• Novel junctions
• Intergenic peptides
• Alternative reading frame
Methods of generation of customized databases
• 6-frame translation of reference database: Perl or Python scripts
• Ab initio gene prediction: Perl and Python scripts
• RNA-seq data: customProDB, Galaxy-P system, sapFinder
• Whole genome sequencing: Peppy
• Other databases: OMIM, neXtProt, ECgene, ChimerDB, COSMIC
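Six-frame translation is the simplest of these methods to script. The sketch below is one possible Python version, assuming the standard genetic code, uppercase or lowercase A/C/G/T input, and "X" for any codon containing other characters. The function names and frame labels (+1 to +3, -1 to -3) are illustrative choices, not taken from the slides.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return the six translations keyed by frame label, e.g. '+1' or '-3'."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

To build a searchable database, each translated frame would typically be split at stop codons ("*") and the resulting stretches above a minimum length written out as FASTA entries.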
What does proteogenomics offer?
Combining proteomics, genomics, and transcriptomics yields:
• Novel N termini
• Novel exons
• Novel junctions
• Variant peptides
• Novel ORFs
• Novel signal peptide cleavage sites
Open source Web Services
Chemoinformatics and Pharmacoinformatics
Molecular Interactions
Biological Databases
Genome Annotations
Immunoinformatics or Vaccine Informatics
Functional Annotation of Proteins
Proteins Structure Prediction
GPSR: A Resource for Genomics Proteomics and Systems Biology
• A journey from simple computer programs to drug/vaccine informatics
• Limitations of existing web services
• History repeats (Web to Standalone)
• Graphics vs command mode
• General purpose programs
• Small programs as building unit
• Integration of methods in GPSR
Types of Prediction Methods
Customized operating environment for drug discovery pipeline
• Domains: bioinformatics, vaccine informatics, cheminformatics
• Distribution: live server, live CD, installation package, repository
• Usage modes: webserver, standalone, Galaxy platform
• All in ONE
Osddlinux desktop is ready for use.
• Password for sudo: osddlinux
• Password for root: osddlinux
• Osddlinux installation on system hard drive
Operating System for Drug Discovery