Data Normalization of ¹H NMR Metabolite Fingerprinting Data Sets in the Presence of Unbalanced Metabolite Regulation

J Proteome Res. 2015 Aug 7;14(8):3217-28. doi: 10.1021/acs.jproteome.5b00192. Epub 2015 Jul 22.

Abstract

Data normalization is an essential step in NMR-based metabolomics. Conducted properly, it improves data quality and removes unwanted biases. The choice of the appropriate normalization method is critical and depends on the inherent properties of the data set in question. In particular, the presence of unbalanced metabolic regulation, where the different specimens and cohorts under investigation do not contain approximately equal shares of up- and down-regulated features, may strongly influence data normalization. Here, we demonstrate the suitability of the Shapiro-Wilk test to detect such unbalanced regulation. Next, employing a Latin-square design consisting of eight metabolites spiked into a urine specimen at eight different known concentrations, we show that commonly used normalization and scaling methods fail to retrieve true metabolite concentrations in the presence of increasing amounts of glucose added to simulate unbalanced regulation. However, by learning the normalization parameters on a subset of nonregulated features only, Linear Baseline Normalization, Probabilistic Quotient Normalization, and Variance Stabilization Normalization were found to account well for different dilutions of the samples without distorting the true spike-in levels, even in the presence of marked unbalanced metabolic regulation. Finally, the methods described were applied successfully to a real-world example of unbalanced regulation, namely, a set of plasma specimens collected from patients with and without acute kidney injury after cardiac surgery with cardiopulmonary bypass use.
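The idea of learning normalization parameters on nonregulated features only can be illustrated with a minimal Python sketch of Probabilistic Quotient Normalization. This is not the authors' implementation: the function name `pqn_normalize`, the parameter `ref_features`, and the toy data are hypothetical, intensities are assumed to be strictly positive, and the total-area normalization that often precedes PQN is omitted for brevity.

```python
import numpy as np

def pqn_normalize(X, ref_features=None):
    """Sketch of PQN with dilution factors learned on a feature subset.

    X            : array of shape (samples, features), positive intensities.
    ref_features : indices of features assumed to be nonregulated
                   (hypothetical parameter); None means use all features.
    Returns the normalized matrix and the per-sample dilution estimates.
    """
    X = np.asarray(X, dtype=float)
    sub = X if ref_features is None else X[:, ref_features]
    # Reference spectrum: feature-wise median across samples (subset only).
    reference = np.median(sub, axis=0)
    # Quotients of each sample against the reference spectrum.
    quotients = sub / reference
    # Most probable dilution per sample: median of its quotients.
    dilution = np.median(quotients, axis=1)
    # Apply the subset-derived factors to ALL features.
    return X / dilution[:, None], dilution

# Toy data (assumed): one base spectrum measured at four dilutions.
base = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
true_dilution = np.array([1.0, 2.0, 0.5, 4.0])
X = np.outer(true_dilution, base)
# Simulate unbalanced regulation: features 3-5 go up 5-fold in samples 2 and 3.
X[2:, 3:] *= 5.0

normalized, dilution = pqn_normalize(X, ref_features=[0, 1, 2])
```

In this toy setting the dilution estimates depend only on features 0 to 2, so the simulated up-regulation of the remaining features does not leak into the scaling factors. In practice the selection of nonregulated features is itself the hard part and has to be justified by the data.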

Keywords: NMR; confounding factors; data normalization; metabolomics; unbalanced regulation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acute Kidney Injury / blood
  • Acute Kidney Injury / metabolism
  • Algorithms
  • Biometry / methods*
  • Cardiopulmonary Bypass
  • Humans
  • Metabolome*
  • Metabolomics / methods*
  • Probability
  • Proton Magnetic Resonance Spectroscopy / methods*
  • Reproducibility of Results