We describe a new multivariate statistical approach for recovering metabolite structure information from multiple ¹H NMR spectra in population sample sets. Subset optimization by reference matching (STORM) was developed to select subsets of ¹H NMR spectra that contain specific spectroscopic signatures of biomarkers that differentiate human populations. STORM aims to improve the visualization of structural correlations in spectroscopic data by using these reduced spectral subsets, which contain fewer samples than variables (n ≪ p). We used statistical shrinkage to limit the number of false positive associations and to simplify the overall interpretation of the autocorrelation matrix. We applied STORM to findings from an ongoing human metabolome-wide association study on body mass index to identify a biomarker metabolite present in a subset of the population. We also show how STORM improves the visualization of more abundant NMR peaks compared with a previously published method, statistical total correlation spectroscopy (STOCSY). STORM is a useful new tool for biomarker discovery in the "omic" sciences and has widespread applicability. It can be applied to any type of data, provided that there is interpretable correlation among variables, and it can also be applied to data with more than one dimension (e.g., 2D NMR spectra).
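As an illustration only, the idea of selecting a spectral subset by matching to a reference and then correlating variables within that subset can be sketched in Python with NumPy. This is not the published STORM implementation. The function names (`select_subset`, `driver_correlation`, `gaussian_peak`), the synthetic spectra, the fixed subset size `n_keep`, and the use of plain Pearson correlation without any shrinkage step are all assumptions made for this sketch.

```python
import numpy as np


def gaussian_peak(center, n_vars, width=2.0):
    """Hypothetical peak shape used only to build synthetic spectra."""
    idx = np.arange(n_vars)
    return np.exp(-0.5 * ((idx - center) / width) ** 2)


def select_subset(X, window, reference, n_keep):
    """Keep the n_keep spectra whose signal in `window` best matches `reference`.

    Matching is scored by Pearson correlation (an assumption of this sketch).
    """
    seg = X[:, window]
    seg_c = seg - seg.mean(axis=1, keepdims=True)
    ref_c = reference - reference.mean()
    denom = np.linalg.norm(seg_c, axis=1) * np.linalg.norm(ref_c)
    r = (seg_c @ ref_c) / np.where(denom == 0, np.inf, denom)
    return np.argsort(r)[::-1][:n_keep]


def driver_correlation(X, driver):
    """Pearson correlation of every variable with one driver variable."""
    Xc = X - X.mean(axis=0)
    d = Xc[:, driver]
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(d)
    return (Xc.T @ d) / np.where(denom == 0, np.inf, denom)


# Synthetic data: 40 spectra, 200 variables; only the first 10 spectra
# contain a metabolite with peaks centred at variables 50 and 120.
rng = np.random.default_rng(0)
n_samples, n_vars = 40, 200
carriers = np.arange(10)
amps = rng.uniform(1.0, 3.0, size=carriers.size)
X = rng.normal(0.0, 0.01, size=(n_samples, n_vars))
X[carriers] += amps[:, None] * (
    gaussian_peak(50, n_vars) + gaussian_peak(120, n_vars)
)

# Reference signature: the expected peak shape around variable 50.
window = slice(45, 56)
reference = gaussian_peak(50, n_vars)[window]

# Select the matching subset (n = 10, far fewer than p = 200 variables),
# then correlate all variables with the driver peak inside that subset.
subset = select_subset(X, window, reference, n_keep=10)
r = driver_correlation(X[subset], driver=50)
```

In this toy setting, the correlation profile `r` computed on the selected subset links the driver peak to the second peak of the same synthetic metabolite. With so few samples relative to variables, spurious correlations are expected elsewhere in `r`, which is the situation the shrinkage step described above is meant to address.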