Honey authentication is a complex process that traditionally requires costly and time-consuming analytical techniques not readily available to producers. This study aimed to develop non-invasive sensor methods coupled with multivariate data analysis to detect the type and percentage of exogenous sugar adulteration in UK honeys. Through-container spatially offset Raman spectroscopy (SORS) was employed on 17 different types of natural honeys produced in the UK over a season. These samples were then spiked with rice and sugar beet syrups at levels of 10%, 20%, 30%, and 50% w/w. The data acquired were used to construct prediction models for 14 types of honey with similar Raman fingerprints using three algorithms, namely PLS-DA, XGBoost, and Random Forest, to detect the level of adulteration per type of sugar syrup. The best-performing algorithm for classification was Random Forest, with only 1% of the pure honeys misclassified as adulterated and <3.5% of adulterated honey samples misclassified as pure. Random Forest was further employed to create a classification model that successfully classified samples according to the type of adulterant (rice or sugar beet) and the adulteration level. In addition, SORS spectra were collected from 27 samples of heather honey (24 Calluna vulgaris and 3 Erica cinerea) produced in the UK and from corresponding subsamples spiked with high-fructose sugar cane syrup. An exploratory data analysis with PCA and a classification with Random Forest both showed clear separation between the pure and adulterated samples at medium (40%) and high (60%) adulteration levels, and a 90% success rate at the low adulteration level (20%). The results of this study demonstrate the potential of SORS in combination with machine learning for the authentication of honey samples and the detection of exogenous sugars in the form of sugar syrups.
A major advantage of the SORS technique is that it is a rapid, non-invasive method deployable in the field with potential application at all stages of the supply chain.
Keywords: SORS; classification; honey; random forest; regression.
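As a worked illustration of the w/w spiking levels quoted in the abstract (10%, 20%, 30%, and 50%), the following minimal sketch computes the honey and syrup masses needed for a mixture of a given total mass. It assumes that "w/w" denotes the mass of syrup divided by the total mass of the spiked sample; the 10 g total and the function name `spike_masses` are hypothetical and are not taken from the study.

```python
def spike_masses(total_g, syrup_fraction):
    """Return (honey_g, syrup_g) for a spiked sample of total_g grams.

    Assumption: syrup_fraction is a w/w fraction, i.e. the mass of syrup
    divided by the total mass of the final honey + syrup mixture.
    """
    if not 0.0 <= syrup_fraction < 1.0:
        raise ValueError("syrup_fraction must lie in [0, 1)")
    syrup_g = total_g * syrup_fraction
    honey_g = total_g - syrup_g
    return honey_g, syrup_g


# Adulteration levels quoted in the abstract for rice and sugar beet syrups.
LEVELS = (0.10, 0.20, 0.30, 0.50)
TOTAL_G = 10.0  # hypothetical sample mass, chosen only for illustration

for level in LEVELS:
    honey_g, syrup_g = spike_masses(TOTAL_G, level)
    print(f"{level:.0%} w/w: {honey_g:.1f} g honey + {syrup_g:.1f} g syrup")
```

Under this definition, a 50% w/w sample contains equal masses of honey and syrup, whereas a 10% w/w sample contains nine parts honey to one part syrup.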