MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry-based proteomics

Nat Methods. 2017 May;14(5):513-520. doi: 10.1038/nmeth.4256. Epub 2017 Apr 10.

Abstract

There is a need to better understand and handle the 'dark matter' of proteomics: the vast diversity of post-translational and chemical modifications that are unaccounted for in a typical mass spectrometry-based analysis and thus remain unidentified. We present a fragment-ion indexing method, and its implementation in the peptide identification tool MSFragger, that enables a more than 100-fold improvement in speed over most existing proteome database search tools. Using several large proteomic data sets, we demonstrate how MSFragger empowers the open database search concept for comprehensive identification of peptides and all their modified forms, uncovering dramatic differences in modification rates across experimental samples and conditions. We further illustrate its utility using protein-RNA cross-linked peptide data and using affinity purification experiments where we observe, on average, a 300% increase in the number of identified spectra for enriched proteins. We also discuss the benefits of open searching for improved false discovery rate estimation in proteomics.
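
The abstract does not describe how the fragment-ion index is built, so the following is only a toy sketch of the general idea, not MSFragger's implementation. It assumes an inverted index from binned theoretical fragment m/z values to peptide IDs, scores candidates by a simple matched-peak count, and contrasts a narrow precursor window with a wide ("open") one. The bin width, tolerance, peptide list, simulated spectrum, and all function names are hypothetical choices made for illustration.

```python
from collections import defaultdict

# Standard monoisotopic residue masses (Da) for the few residues used here.
RESIDUE = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
           "V": 99.06841, "L": 113.08406, "K": 128.09496, "E": 129.04259,
           "R": 156.10111}
WATER, PROTON = 18.01056, 1.00728
BIN = 0.02   # assumed bin width (Da) for the index
TOL = 0.01   # assumed fragment tolerance (Da); must not exceed BIN


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[a] for a in seq) + WATER


def fragment_ions(seq):
    """Return (b_ions, y_ions) as singly charged m/z lists, one pair per cleavage site."""
    total = sum(RESIDUE[a] for a in seq)
    b, y, prefix = [], [], 0.0
    for aa in seq[:-1]:
        prefix += RESIDUE[aa]
        b.append(prefix + PROTON)
        y.append(total - prefix + WATER + PROTON)
    return b, y


def build_index(peptides):
    """Inverted index: m/z bin -> list of (fragment m/z, peptide id)."""
    index = defaultdict(list)
    for pid, seq in enumerate(peptides):
        b, y = fragment_ions(seq)
        for mz in b + y:
            index[round(mz / BIN)].append((mz, pid))
    return index


def search(index, peptides, peaks, precursor_mass, window):
    """Rank peptides within `window` Da of the precursor by matched-peak count."""
    counts = defaultdict(int)
    for peak in peaks:
        center = round(peak / BIN)
        matched = set()  # count each peptide at most once per peak
        for key in (center - 1, center, center + 1):
            for mz, pid in index.get(key, ()):
                if abs(mz - peak) <= TOL:
                    matched.add(pid)
        for pid in matched:
            counts[pid] += 1
    hits = []
    for pid, score in counts.items():
        delta = precursor_mass - peptide_mass(peptides[pid])
        if abs(delta) <= window:
            hits.append((score, peptides[pid], delta))
    return sorted(hits, reverse=True)


# Hypothetical three-peptide "database".
peptides = ["PEPSVK", "GASPVR", "LEVKAR"]
index = build_index(peptides)

# Simulated spectrum of PEPSVK carrying a +79.96633 Da shift (phosphorylation) on S:
# fragments without S keep their m/z, fragments containing S are shifted.
shift = 79.96633
b, y = fragment_ions("PEPSVK")
peaks = b[:3] + y[3:] + [mz + shift for mz in b[3:] + y[:3]]
precursor = peptide_mass("PEPSVK") + shift

narrow = search(index, peptides, peaks, precursor, window=0.02)
open_hits = search(index, peptides, peaks, precursor, window=500.0)
print(narrow)
print(open_hits)
```

In this sketch the narrow-window search returns no candidate because the modified precursor mass matches no unmodified peptide, whereas the open search recovers the peptide from its unshifted fragments and reports the mass difference as the putative modification. Each observed peak is looked up directly in the index rather than compared against every candidate peptide, which is the property an index of this kind is meant to exploit.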

MeSH terms

  • Algorithms
  • Computational Biology / instrumentation
  • Computational Biology / methods*
  • Databases, Protein
  • HEK293 Cells
  • Humans
  • Peptide Fragments / chemistry*
  • Protein Processing, Post-Translational
  • Proteome / chemistry*
  • Proteomics / instrumentation
  • Proteomics / methods*
  • Tandem Mass Spectrometry / methods*

Substances

  • Peptide Fragments
  • Proteome