Annotation of metabolites from gas chromatography/atmospheric pressure chemical ionization tandem mass spectrometry data using an in silico generated compound database and MetFrag

Rapid Commun Mass Spectrom. 2015 Aug 30;29(16):1521-9. doi: 10.1002/rcm.7244.

Abstract

Rationale: Gas chromatography (GC) coupled to atmospheric pressure chemical ionization quadrupole time-of-flight mass spectrometry (APCI-QTOFMS) is an emerging technology in metabolomics. Reference spectra for GC/APCI-MS/MS barely exist; therefore, in silico fragmentation approaches and structure databases are prerequisites for annotation. To expand the limited coverage of derivatised structures in structure databases, in silico derivatisation procedures are required.

Methods: A cheminformatics workflow was developed for in silico derivatisation of compounds found in KEGG and PubChem, and validated on the Golm Metabolome Database (GMD). To demonstrate this workflow, the in silico generated databases were applied together with MetFrag to APCI-MS/MS spectra acquired from GC/APCI-MS/MS profiles of Arabidopsis thaliana and Solanum tuberosum. The Metabolite-Likeness of the original (underivatised) candidate structure was included as an additional scoring term to favour candidate structures of natural origin.
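The idea behind in silico derivatisation can be illustrated with a minimal sketch. It is not the authors' workflow: it assumes trimethylsilylation (TMS) as the derivatisation type, which the abstract does not specify, and it only enumerates the monoisotopic masses a candidate could have with zero up to all of its active hydrogens replaced. The function name `derivatised_masses` and its parameters are hypothetical.

```python
# Monoisotopic masses (Da) of the elements that make up a TMS group.
MASS = {"C": 12.0, "H": 1.00782503207, "Si": 27.9769265325}

# Assumption: each TMS group replaces one active hydrogen with Si(CH3)3,
# which is a net gain of C3H8Si per derivatised site.
TMS_SHIFT = 3 * MASS["C"] + 8 * MASS["H"] + MASS["Si"]


def derivatised_masses(monoisotopic_mass, active_hydrogens):
    """Map the number of attached TMS groups to the resulting mass.

    Covers partial derivatisation: 0, 1, ..., active_hydrogens groups.
    """
    if active_hydrogens < 0:
        raise ValueError("active_hydrogens must be non-negative")
    return {
        n: monoisotopic_mass + n * TMS_SHIFT
        for n in range(active_hydrogens + 1)
    }


if __name__ == "__main__":
    # Hypothetical candidate: mass 100.0 Da with two active hydrogens.
    for n_tms, mass in derivatised_masses(100.0, 2).items():
        print(f"{n_tms} TMS: {mass:.4f} Da")
```

A structure-based workflow would instead rewrite the functional groups of each candidate with a cheminformatics toolkit; the mass arithmetic above only shows why derivatised candidates are missing from a database of native structures.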

Results: The validation of our in silico derivatisation workflow on the GMD showed a true positive rate of 94%. MetFrag was applied to two datasets, with the in silico derivatised KEGG and PubChem databases serving as candidate sources. For both datasets the Metabolite-Likeness score improved the identification performance. The derivatised data sources have been integrated into the MetFrag web application for the annotation of GC/APCI-MS/MS spectra.

Conclusions: We demonstrated that MetFrag can support the identification of components from GC/APCI-MS/MS profiles, especially in the (common) case where reference spectra are not available. This workflow can be easily adapted to other types of derivatisation and is freely accessible together with the generated structure databases.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Data Curation / methods*
  • Databases, Chemical*
  • Gas Chromatography-Mass Spectrometry / methods*
  • Internet
  • Models, Chemical
  • Plant Extracts / analysis
  • Plant Extracts / chemistry
  • Reproducibility of Results
  • Software*
  • Tandem Mass Spectrometry / methods*

Substances

  • Plant Extracts