Open-source Python module for automated preprocessing of near infrared spectroscopic data

Anal Chim Acta. 2020 Apr 29:1108:1-9. doi: 10.1016/j.aca.2020.02.030. Epub 2020 Feb 17.

Abstract

Near infrared spectroscopy (NIRS) is an analytical technique for determining the chemical composition or structure of a given sample. For several decades, NIRS has been a frequently used analysis tool in agriculture, pharmacology, medicine, and petrochemistry, and its popularity continues to grow as new application areas are discovered. In contrast to the mid infrared spectral region, the absorption bands in the near infrared spectral region are often non-specific, broad, and overlapping. Analysis of NIR spectra therefore requires multivariate methods, which are highly susceptible to noise arising from instrumentation, scattering effects, and the measurement setup. NIRS measurements are also frequently performed outside of a laboratory, which further contributes to the presence of noise. Preprocessing is thus a critical step in NIRS, as it can vastly improve the performance of multivariate models. While extensive research on various preprocessing methods exists, the best preprocessing method is often selected through trial and error. A more powerful approach for optimizing preprocessing in NIRS models would be to automatically compare a large number of preprocessing techniques (e.g., through grid search or hyperparameter tuning). To enable this, we present nippy, an open-source Python module for semi-automatic comparison of NIRS preprocessing methods (available at https://github.com/uef-bbc/nippy). We provide here a brief overview of the capabilities of nippy and demonstrate its typical usage through two examples with public datasets.
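
The idea of comparing many preprocessing combinations through a grid search can be illustrated with a minimal sketch. It does not use nippy's API. It relies only on NumPy, and every function name, parameter grid, and the synthetic dataset below is a hypothetical choice made for illustration. Moving-average smoothing, standard normal variate (SNV), and finite-difference derivatives stand in for the wider range of methods such a comparison would normally cover, and a simple ridge regression stands in for the multivariate model.

```python
import itertools
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row)."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std


def smooth(spectra, window):
    """Moving-average smoothing along the wavelength axis."""
    kernel = np.ones(window) / window
    return np.apply_along_axis(
        lambda s: np.convolve(s, kernel, mode="same"), 1, spectra
    )


def derivative(spectra, order):
    """Finite-difference derivative of the given order along wavelengths."""
    for _ in range(order):
        spectra = np.gradient(spectra, axis=1)
    return spectra


def build_pipelines():
    """Enumerate every combination of the (hypothetical) option grid."""
    grid = {
        "smooth_window": [None, 5, 11],
        "snv": [False, True],
        "deriv_order": [0, 1, 2],
    }
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in itertools.product(*grid.values())]


def apply_pipeline(spectra, config):
    """Apply smoothing, then SNV, then differentiation, as configured."""
    out = spectra.copy()
    if config["smooth_window"] is not None:
        out = smooth(out, config["smooth_window"])
    if config["snv"]:
        out = snv(out)
    if config["deriv_order"] > 0:
        out = derivative(out, config["deriv_order"])
    return out


def cv_rmse(X, y, n_folds=5, ridge=1e-3):
    """K-fold cross-validated RMSE of a ridge regression (dual form)."""
    idx = np.arange(len(y))
    squared_errors = []
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        x_mean, y_mean = X[train].mean(axis=0), y[train].mean()
        Xc = X[train] - x_mean
        K = Xc @ Xc.T
        # Scale the penalty by the mean diagonal of K so that it does not
        # depend on the overall magnitude of the preprocessed spectra.
        lam = ridge * np.trace(K) / len(train)
        alpha = np.linalg.solve(K + lam * np.eye(len(train)), y[train] - y_mean)
        pred = (X[test] - x_mean) @ (Xc.T @ alpha) + y_mean
        squared_errors.append((pred - y[test]) ** 2)
    return float(np.sqrt(np.concatenate(squared_errors).mean()))


# Synthetic spectra (an assumption, not real NIRS data): one absorption band
# whose height tracks the target, plus offset, scatter, and noise.
rng = np.random.default_rng(0)
n_samples, n_wavelengths = 60, 100
axis = np.linspace(0.0, 1.0, n_wavelengths)
y = rng.uniform(0.5, 1.5, n_samples)
band = np.exp(-0.5 * ((axis - 0.5) / 0.05) ** 2)
scatter = rng.uniform(0.8, 1.2, (n_samples, 1))
offset = rng.uniform(0.0, 0.5, (n_samples, 1))
noise = rng.normal(0.0, 0.02, (n_samples, n_wavelengths))
X = scatter * (y[:, None] * band + 1.0) + offset + noise

pipelines = build_pipelines()
results = sorted(
    ((cv_rmse(apply_pipeline(X, cfg), y), cfg) for cfg in pipelines),
    key=lambda item: item[0],
)
for rmse, cfg in results[:3]:
    print(f"RMSE={rmse:.4f}  {cfg}")
```

The ranking printed at the end depends entirely on the synthetic data and the chosen model, so it should not be read as evidence for any particular preprocessing method. The point is only the structure: enumerate the combinations, apply each one, and score them with the same cross-validated metric.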

Keywords: Chemometrics; Near infrared spectroscopy; Preprocessing.