A reproducibility-based evaluation procedure for quantifying the differences between MS/MS peak intensity normalization methods

Proteomics. 2011 Mar;11(6):1172-80. doi: 10.1002/pmic.201000605. Epub 2011 Feb 7.

Abstract

The identification of peptides and proteins from fragmentation mass spectra is a common approach in proteomics. Contemporary high-throughput peptide identification pipelines can quickly produce large quantities of MS/MS data that contain valuable knowledge about the physicochemical processes involved in peptide fragmentation, which can be extracted through extensive data mining studies. Because these studies attempt to exploit the intensity information contained in the MS/MS spectra, peak intensity normalization is a critical step for any meaningful comparison of this information between spectra. Here we describe a procedure for quantifying the efficiency of different published normalization methods in terms of the quartile coefficient of dispersion (qcod) statistic. The qcod is applied to measure the dispersion of the peak intensities between redundant MS/MS spectra, which allows the differences in computed peak intensity reproducibility between the normalization methods to be quantified. We demonstrate that our results are independent of the data set used in the evaluation procedure, allowing us to provide generic guidance on the choice of normalization method for a given MS/MS pipeline application.
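
As an illustration of the idea described above, the quartile coefficient of dispersion is conventionally defined as (Q3 - Q1) / (Q3 + Q1), where Q1 and Q3 are the first and third quartiles. The sketch below is not the authors' implementation. It assumes that the redundant spectra have already been reduced to aligned lists of matched peak intensities, and it uses two commonly used normalizations (division by the summed intensity and division by the base peak) purely as stand-ins, since the abstract does not name the methods that were compared. The function names, the toy intensities, and the choice of the median as the summary over peaks are all hypothetical.

```python
from statistics import median, quantiles


def qcod(values):
    """Quartile coefficient of dispersion: (Q3 - Q1) / (Q3 + Q1).

    Assumes at least two values and a positive Q1 + Q3.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / (q3 + q1)


def normalize_by_total(spectrum):
    """Divide each peak by the summed intensity of the spectrum."""
    total = sum(spectrum)
    return [x / total for x in spectrum]


def normalize_by_base_peak(spectrum):
    """Divide each peak by the most intense peak of the spectrum."""
    top = max(spectrum)
    return [x / top for x in spectrum]


def median_qcod(spectra, normalize):
    """Median qcod over matched peaks of redundant spectra.

    `spectra` is a list of equal-length intensity lists in which
    position i always refers to the same fragment ion (an assumption
    of this sketch; real data would need peak matching first).
    """
    normalized = [normalize(s) for s in spectra]
    # zip(*...) regroups the data so each tuple holds one peak
    # as observed across all redundant spectra.
    per_peak = [qcod(peak) for peak in zip(*normalized)]
    return median(per_peak)


# Hypothetical toy data: four redundant spectra, three matched peaks.
redundant_spectra = [
    [100.0, 50.0, 25.0],
    [210.0, 95.0, 55.0],
    [48.0, 27.0, 12.0],
    [400.0, 190.0, 110.0],
]

for name, method in [
    ("raw", lambda s: s),
    ("total intensity", normalize_by_total),
    ("base peak", normalize_by_base_peak),
]:
    print(f"{name}: median qcod = {median_qcod(redundant_spectra, method):.3f}")
```

A lower median qcod indicates that a peak's normalized intensity is more reproducible across the redundant spectra, which is the sense in which normalization methods can be ranked against each other. Because the qcod is a ratio of quartiles, it is unaffected by rescaling all values by a common positive factor, so it compares dispersion fairly between methods that place intensities on different scales.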

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology
  • Data Interpretation, Statistical
  • Databases, Protein
  • Humans
  • Peptides / isolation & purification
  • Proteomics / methods
  • Proteomics / standards*
  • Proteomics / statistics & numerical data*
  • Quality Control
  • Reproducibility of Results
  • Tandem Mass Spectrometry / standards*
  • Tandem Mass Spectrometry / statistics & numerical data*

Substances

  • Peptides