Welcome Mass Spectrometry meets Cheminformatics WCMC Metabolomics Course
Welcome! WCMC Metabolomics Course 2013, Tobias Kind
Course 6: Concepts for LC-MS
http://fiehnlab.ucdavis.edu/staff/kind
CC-BY License
General LC-MS data processing for small molecules
Confirm with MS/MS or MSn fragmentation.
[Figure: MS2 spectrum]
Deconvolution and evaluation of LC-MS data
LC-MS chromatogram: 40-minute LC-MS run, C8 column, Agilent UPLC, Chlamydomonas extract (picture: nrel.gov)
Extracted mass spectrum: FT-ICR-MS mass spectrum, MS1 at 50,000 resolving power
Check charge state = 1; m/z 756.57 represents [M+H]+
Chromatogram source: N. Saad, DY Lee, Fiehn Lab
Peak picking with ACD/SpecManager 9.0
Processing of LC-MS data: use of Mass Frontier
Deconvolution and evaluation of LC-MS data
Example with HighChem Mass Frontier
• LC-MS detected compound is marked with a blue triangle
• 141 peaks extracted
• Extracted MS1 peak: library search is useless (only a single peak)
Adduct removal and detection during ESI-LC-MS runs
[M+H]+ = 756.577
M = 755.5627
Download: Adduct-Calculator
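The adduct arithmetic on this slide can be sketched in a few lines. This is a minimal illustration, not the Adduct-Calculator itself: the table holds only four common positive-mode adducts, and the mass shifts are commonly tabulated values taken here as an assumption. Note that M = 755.5627 corresponds to subtracting the proton mass from m/z 756.57; starting from 756.577 gives 755.5697.

```python
# Mass shifts (Da) for a few singly charged positive-mode adducts.
# Assumption: commonly tabulated values; a full adduct table is much longer.
ADDUCT_SHIFT = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}


def neutral_mass(mz, adduct):
    """Remove the adduct: neutral monoisotopic mass M from an observed m/z."""
    return mz - ADDUCT_SHIFT[adduct]


def adduct_mz(mass, adduct):
    """Add the adduct: expected m/z for a neutral monoisotopic mass M."""
    return mass + ADDUCT_SHIFT[adduct]


if __name__ == "__main__":
    # Every adduct hypothesis leads to a different neutral mass.
    for adduct in ADDUCT_SHIFT:
        print(adduct, round(neutral_mass(756.57, adduct), 4))
```

Each adduct hypothesis yields a different neutral mass, which is why a wrong adduct assignment sends formula generation in the wrong direction.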
Problem: Detection of molecular ion
• Is this the pure mass spectrum, or does it come from overlapping peaks?
• Is it M+H, M+Na, or any of the 40 other adducts?
Example data from a crocin standard mixture (expected MW: 976.965)
[Figure: spectra at CID = 0 V and CID = 100 V, each with [M+Na]+ labeled]
Adduct removal and detection during LC-MS runs
• The adduct can be removed before or after formula generation.
• For good isotopic pattern matching, remove the adduct after formula generation.
• When the Seven Golden Rules are applied, the adduct needs to be removed for the LEWIS and SENIOR check.
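Why the adduct has to be removed before the SENIOR check can be shown with a simplified sketch. The valence table below is an assumption (P and S can take higher valences), the LEWIS check is not implemented, and the function names are illustrative only.

```python
import re

# Assumed valences; P and S can take higher valences, so this is a simplification.
VALENCE = {"C": 4, "H": 1, "N": 3, "O": 2, "P": 3, "S": 2}


def parse_formula(formula):
    """Parse a simple formula such as 'C46H77NO7' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def senior_check(formula):
    """Simplified SENIOR conditions for a neutral molecular formula."""
    counts = parse_formula(formula)
    atoms = sum(counts.values())
    valence_sum = sum(VALENCE[element] * n for element, n in counts.items())
    max_valence = max(VALENCE[element] for element in counts)
    return (
        valence_sum % 2 == 0                  # sum of valences is even
        and valence_sum >= 2 * max_valence    # at least twice the maximum valence
        and valence_sum >= 2 * (atoms - 1)    # enough bonds for a connected graph
    )


if __name__ == "__main__":
    print(senior_check("C46H77NO7"))  # neutral formula
    print(senior_check("C46H78NO7"))  # same formula with the adduct proton left in
```

The neutral formula passes, while the composition that still contains the adduct proton has an odd valence sum and is rejected.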
Formula generation from accurate mass measurement
CHN5O13P15 or CHN11P19???
• Apply the Seven Golden Rules for correct molecular formulas
• Apply heuristic, mathematical and chemometric rules for filtering elemental compositions
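A brute-force sketch shows why filtering rules are needed: even a restricted CHNOP search returns many compositions for a single accurate mass. The element limits, the hydrogen upper bound and the function names are assumptions chosen for illustration; this is not the Seven Golden Rules software. The input mass 755.5627 and the 30 ppm window are the values quoted on the slides.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048,
        "O": 15.99491461956, "P": 30.97376163}


def format_formula(c, h, n, o, p):
    """Build a formula string such as 'C46H77NO7' from element counts."""
    parts = []
    for symbol, count in (("C", c), ("H", h), ("N", n), ("O", o), ("P", p)):
        if count:
            parts.append(symbol + (str(count) if count > 1 else ""))
    return "".join(parts)


def generate_formulas(target_mass, ppm=5.0, max_c=60, max_n=10, max_o=15, max_p=2):
    """Enumerate CHNOP compositions whose mass lies within +/- ppm of target_mass."""
    hits = []
    for c in range(1, max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                for p in range(max_p + 1):
                    heavy = (c * MONO["C"] + n * MONO["N"]
                             + o * MONO["O"] + p * MONO["P"])
                    # Fill the remaining mass with hydrogen atoms.
                    h = round((target_mass - heavy) / MONO["H"])
                    # Assumed crude upper bound on H for trivalent N and P.
                    if h < 0 or h > 2 * c + n + p + 2:
                        continue
                    mass = heavy + h * MONO["H"]
                    error_ppm = (mass - target_mass) / target_mass * 1e6
                    if abs(error_ppm) <= ppm:
                        hits.append((format_formula(c, h, n, o, p), mass, error_ppm))
    return sorted(hits, key=lambda hit: abs(hit[2]))


if __name__ == "__main__":
    candidates = generate_formulas(755.5627, ppm=30.0)
    print(len(candidates), "candidate formulas before any further filtering")
    for formula, mass, error_ppm in candidates[:5]:
        print(formula, round(mass, 4), round(error_ppm, 2))
```

Isotopic pattern matching and element-ratio heuristics then cut this list down to a handful of candidates.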
Isotopic pattern generator from formula
Example from MWTWIN
Usually included in every LC-MS software
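What such a generator does can be sketched as repeated convolution of per-element isotope distributions, binned by nominal mass. The abundance values are commonly tabulated natural abundances used here as an assumption. This is not the MWTWIN algorithm, and it ignores isotopic fine structure.

```python
# Natural isotope abundances indexed by extra nominal mass units (+0, +1, +2).
# Assumption: commonly tabulated values; other tables differ in the last digits.
ISOTOPES = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99632, 0.00368],
    "O": [0.99757, 0.00038, 0.00205],
    "P": [1.0],
}


def convolve(a, b, max_len):
    """Convolve two distributions, keeping only the first max_len bins."""
    out = [0.0] * min(len(a) + len(b) - 1, max_len)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            if i + j < len(out):
                out[i + j] += pa * pb
    return out


def isotope_pattern(counts, n_peaks=4):
    """Return A, A+1, A+2, ... abundances in percent of the monoisotopic peak A."""
    pattern = [1.0]
    for element, n in counts.items():
        for _ in range(n):
            pattern = convolve(pattern, ISOTOPES[element], n_peaks)
    return [100.0 * p / pattern[0] for p in pattern]


if __name__ == "__main__":
    pattern = isotope_pattern({"C": 46, "H": 77, "N": 1, "O": 7})
    for offset, abundance in enumerate(pattern):
        print("A+%d" % offset, round(abundance, 2))
```

Truncating to the first few bins is exact for those bins, because each bin depends only on lower bins.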
Isotopic pattern is equally important as accurate mass
Experimental result: A+1 = 47.44%, A+2 = 10.92%
[Figure: abundances for all molecular formulae, with error box]
We can discard all results outside the error box. The current box reflects a ±10% error.
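The error-box filter can be written as a one-line test per candidate. In this sketch the ±10% is treated as a relative error (an assumption; the slide does not say whether it is relative or absolute), and the two candidate patterns are invented purely for illustration.

```python
def within_error_box(theoretical, experimental, tolerance_percent=10.0):
    """True if every theoretical abundance is within the relative tolerance."""
    return all(
        abs(theo - exp) <= exp * tolerance_percent / 100.0
        for theo, exp in zip(theoretical, experimental)
    )


# Experimental A+1 and A+2 abundances (percent) from the slide.
experimental = (47.44, 10.92)

# Hypothetical theoretical patterns; the names and values are made up.
candidates = {
    "candidate_1": (46.9, 11.5),
    "candidate_2": (38.0, 8.1),
}

kept = [name for name, theo in candidates.items()
        if within_error_box(theo, experimental)]
print(kept)
```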
Problems during LC-MS peak detection and MS deconvolution
[Figure: chromatogram, x-axis 10.80 to 11.90; multiple peaks detected, but it is a single component only]
• Multiple peaks detected. Solution: adjust deconvolution settings
• Mass spectra not clean. Solution: manual peak extraction
• Not enough peaks detected. Solution: adjust (typically lower) the signal-to-noise (S/N) threshold
Finding optimum settings:
• is non-trivial and can change in different matrices
• can be evaluated on standards and quality check mixtures
• can be obtained by self-sharpening algorithms
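How strongly the S/N threshold controls the number of reported peaks can be seen in a toy peak picker. This is a minimal sketch with a made-up intensity trace and a fixed noise level; real software estimates noise from the data and uses far more robust apex detection.

```python
def pick_peaks(intensities, noise, min_sn):
    """Return indices of local maxima whose signal-to-noise ratio is >= min_sn."""
    peaks = []
    for i in range(1, len(intensities) - 1):
        is_apex = intensities[i - 1] < intensities[i] >= intensities[i + 1]
        if is_apex and intensities[i] / noise >= min_sn:
            peaks.append(i)
    return peaks


# Made-up trace: one large peak, one small peak, and two noise bumps.
trace = [1, 2, 1, 3, 12, 30, 14, 4, 6, 3, 1, 2, 1]

for threshold in (10, 5, 2):
    print("S/N >=", threshold, "->", pick_peaks(trace, noise=1.0, min_sn=threshold))
```

A strict threshold reports only the main peak, while a permissive one also reports noise bumps, which is why the setting has to be tuned per matrix.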
UPLC-FT-MS data extraction with Mass Frontier
Mass Frontier 5.0 report
• MS1: FTMS + p NSI Full, scan #1414; m/z 756.577
• MS2: ITMS + c NSI d Full 756.58, scan #1415; fragment peaks m/z 478.45 and 496.46
[Figure: MS1 spectrum, m/z 751 to 762; MS2 spectrum, m/z 200 to 700, with labeled peaks at 236.20, 271.06, 335.56, 391.34, 434.52, 478.45, 496.46 and 687.72]
Approach:
• generate molecular formula using Seven Golden Rules
• find matching isomers in molecular databases
• confirm possible matches by in-silico fragmentation (usually impossible)
Seven Golden Rules: generate possible molecular formulas
5 formula candidates left with 30 ppm mass accuracy and 10% isotopic abundance error. These are candidates with a good isotopic pattern match. These 5 were found in PubChem.
• C42H78NO8P: 1 isomer hit
• C42H77NO10: 1 isomer hit
• C39H73N5O9: 0 isomer hits
• C43H82NO7P: 2 isomers found
• C43H73N5O6: 2 isomers found
• C45H77N3O6: 1 hit found
• C45H69N7O3: 1 hit found
Scan speed problem: due to poor ion statistics, only a few scans are collected, so mass accuracy and isotopic abundance accuracy are bad.
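For orientation, a ppm tolerance translates into an absolute m/z window as sketched below. The function names are illustrative. The example uses the 30 ppm setting and the [M+H]+ value m/z 756.577 from the slides.

```python
def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def ppm_window(mz, ppm):
    """Lower and upper m/z bounds for a +/- ppm tolerance around mz."""
    delta = mz * ppm * 1e-6
    return mz - delta, mz + delta


if __name__ == "__main__":
    low, high = ppm_window(756.577, 30.0)
    print(round(low, 4), round(high, 4))
```

At m/z 756.577 a 30 ppm tolerance is about ±0.023 Da, wide enough to admit many elemental compositions.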
Structural isomer lookup example in ChemSpider
In-silico fragmentation with Mass Frontier, using a fragmentation library of 20,000 mechanisms from the literature
In-silico fragmentation with Mass Frontier
Experimental peaks m/z 478.45 and 496.46 were detected in the MS/MS spectrum. The in-silico fragmentation should match the experimental fragmentation.
In-silico: using a computer library of 20,000 fragmentation rules from the MS literature
• C42H78NO8P: fragments N.A.
• C42H77NO10: fragments N.A.
• C43H82NO7P: fragments m/z 478 and 496; LWHKEEJOPDYARM-WYCAKVPNBV; possible solution (2 fragments match)
• C45H69N7O3: fragments m/z 496
• C43H82NO7PH: fragments N.A.
Discussion of general LC-MS approach
Example discussion:
• Fragment at m/z 236 is not explained; molecular ion may be wrong
• Substance can be a potential new compound or not in the database
• Must be confirmed by NMR or external standard
General problems:
• Best approach is to generate MS, MSn and MSe mass spectral libraries
• Adduct removal is a problem
• Building target lists is always good (know what to expect)
• Focus on certain substance classes only
• Focus on single compound only
• Substance must be known for in-silico approach
• Fragmentation rules must be captured for in-silico approach
• In-silico approaches work best for peptides, carbohydrates and lipids (due to known and stable fragmentations)
Importance:
• Taxonomic species-compound relationship databases or pathway DBs: KNApSAcK database, KEGG, MetaCyc
All theory is lost if the compound is truly unknown or not in a public database
• Rank 147 in 7GR with original settings
• C46H77NO7 not found in ChemSpider, PubChem, LipidMaps (2013); only found in internal LipidBlast database
• UPLC-FTICR settings readjusted to 3 ppm mass accuracy and 5% isotopic abundance error
• Actual mass error: -1.718266 ppm; max. isotopic abundance error: 4.100%
• Additional evidence: there is no PC in Chlamydomonas, but mostly the betaine lipid DGTS
• Rank 18 in 7GR with original settings (from 56)
DGTS accounts for about 10% of total lipid in Chlamydomonas:
Peter Schlapfer, Waldemar Eichenberger; Plant Science Letters, Volume 32, Issues 1-2, October-November 1983, Pages 243-252
W. Eichenberger, A. Boschetti; FEBS Letters, Volume 88, Issue 2, 15 April 1978, Pages 201-204
LipidBlast comes to help unravel the mystery: best tentative structure proposal so far
Experimental MS/MS: m/z 236 instead of m/z 184; DGTS instead of PC
Precursor m/z 756.576500 (M+H)+
LipidBlast in-silico MS/MS:
Name: DGTS 36:6; [M+H]+; DGTS(18:3/18:3)
MW: 756
Exact Mass: 756.57782
ID#: 9031
DB: dgts+hpos-it.msp
Comment: Parent=756.57782 Mz_exact=756.57782 ; DGTS 36:6; [M+H]+; DGTS(18:3/18:3); C46H77NO7
7 m/z values and intensities:
236.14979  200.00  C10H21NO5H+ (236.14)
478.35339  600.00  [M+H]-sn1-H2O || [M+H]-sn2-H2O
496.36395  600.00  [M+H]-sn1 || [M+H]-sn2
625.48320  400.00  [M+H]-131
639.49885  250.00  [M+H]-117
738.56726  999.00  [M-H2O]+
756.57782   10.00  [M+H]
Betaine lipid DGTS(18:3/18:3)
Kind T, Liu KH, Lee do Y, Defelice B, Meissen JK, Fiehn O. LipidBlast in silico tandem mass spectrometry database for lipid identification. Nature Methods. 2013 Aug; 10(8): 755-8. doi: 10.1038/nmeth.2551. Epub 2013 Jun 30.
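Matching the in-silico spectrum against the experimental one can be sketched as a tolerance search. The library m/z values are the seven LipidBlast peaks listed on this slide. The experimental values are the peak labels from the Mass Frontier report slide. The 0.5 Da tolerance is an assumption for ion-trap MS/MS data, and the simple match count stands in for a real similarity score.

```python
# In-silico LipidBlast peaks for DGTS(18:3/18:3) [M+H]+ (m/z values from the slide).
library_mz = [236.14979, 478.35339, 496.36395, 625.48320,
              639.49885, 738.56726, 756.57782]

# Experimental MS/MS peak labels from the ion-trap MS2 report.
experimental_mz = [236.20, 271.06, 335.56, 391.34, 434.52, 478.45, 496.46, 687.72]


def match_peaks(library, experimental, tolerance=0.5):
    """Pair each library peak with its nearest experimental peak within tolerance."""
    matches = []
    for lib in library:
        nearest = min(experimental, key=lambda exp: abs(exp - lib))
        if abs(nearest - lib) <= tolerance:
            matches.append((lib, nearest))
    return matches


matches = match_peaks(library_mz, experimental_mz)
for lib, exp in matches:
    print(lib, "<->", exp)
```

With these inputs the head-group fragment at m/z 236 and the two acyl-loss fragments at m/z 478 and 496 are matched, in line with the DGTS proposal.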
The Last Page: what is important to remember
• Always use peak picking and mass spectral deconvolution for LC-MS data
• Apply accurate mass and accurate isotopic abundances together with formula generation
• Make use of high resolving power whenever possible
• Use MS/MS data and mass spectra from different ionization voltages
• Use existing MS/MS libraries or create your own MSn tree libraries
• Use molecular isomer databases for obtaining possible structure candidates
• Confirm if possible with MSn data or other possible filter constraints
Validation, validation:
• Pipelines and workflows must be validated with unknown (unknown) compounds
• A compound is tentative until finally approved with a reference standard or by matching multiple orthogonal parameters such as MS/MS, retention time or NMR.
Recent literature (2012/2013)
• Oberacher H. Applying Tandem Mass Spectral Libraries for Solving the Critical Assessment of Small Molecule Identification (CASMI) LC/MS Challenge 2012. Metabolites 2013, 3(2), 312-324. doi: 10.3390/metabo3020312
• Scheubert K, Hufsky F, Böcker S. Computational mass spectrometry for small molecules. J Cheminform. 2013; 5: 12. Published online 2013 March 1. doi: 10.1186/1758-2946-5-12
• Watson D. A Rough Guide to Metabolite Identification Using High Resolution Liquid Chromatography Mass Spectrometry in Metabolomic Profiling in Metazoans. Computational and Structural Biotechnology Journal, Volume 4, Issue 5, January 2013, e201301005. http://dx.doi.org/10.5936/csbj.201301005
• Brown M, Wedge DC, Goodacre R, Kell DB, Baker PN, Kenny LC, Mamas MA, Neyses L, Dunn WB. Automated workflows for accurate mass-based putative metabolite identification in LC/MS-derived metabolomic datasets {PUTMEDID_LCMS}. Bioinformatics. 2011 Apr 15; 27(8): 1108-12. doi: 10.1093/bioinformatics/btr079. Epub 2011 Feb 16.
• Creek DJ, Jankevics A, Burgess KE, Breitling R, Barrett MP. IDEOM: an Excel interface for analysis of LC-MS-based metabolomics data. Bioinformatics. 2012 Apr 1; 28(7): 1048-9. doi: 10.1093/bioinformatics/bts069. Epub 2012 Feb 4.
• Kuhl C, Tautenhahn R, Böttcher C, Larson TR, Neumann S. CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets. Anal Chem. 2012 Jan 3; 84(1): 283-9. doi: 10.1021/ac202450g. Epub 2011 Dec 12.
• Warwick B. Dunn, Alexander Erban, Ralf J. M. Weber, Darren J. Creek, Marie Brown, Rainer Breitling, Thomas Hankemeier, Royston Goodacre, Steffen Neumann, Joachim Kopka, Mark R. Viant. Mass appeal: metabolite identification in mass spectrometry-focused untargeted metabolomics. March 2013, Volume 9, Issue 1 Supplement, pp 44-66.
• Metabolite profiling and beyond: approaches for the rapid processing and annotation of human blood serum mass spectrometry data. Metabolomics and Metabolite Profiling, Analytical and Bioanalytical Chemistry, June 2013, Volume 405, Issue 15, pp 5037-5048.
• Peironcely JE, Rojas-Chertó M, Tas A, Vreeken R, Reijmers T, Coulier L, Hankemeier T. Automated pipeline for de novo metabolite identification using mass-spectrometry-based metabolomics. Anal Chem. 2013 Apr 2; 85(7): 3576-83. doi: 10.1021/ac303218u. Epub 2013 Mar 21.
http://fiehnlab.ucdavis.edu/staff/kind/Metabolomics/Structure_Elucidation/