Comprehensive Analysis of Bile Medicines Based on UHPLC-QTOF-MSE and Machine Learning

ACS Omega. 2024 Oct 8;9(42):43264-43271. doi: 10.1021/acsomega.4c08260. eCollection 2024 Oct 22.

Abstract

UHPLC-QTOF-MSE analysis and quantized data processing were combined with machine learning modeling to digitally identify bear bile powder (BBP), chicken bile powder (CIBP), duck bile powder (DBP), cow bile powder (CBP), sheep bile powder (SBP), pig bile powder (PBP), snake bile powder (SNBP), rabbit bile powder (RBP), and goose bile powder (GBP). First, 173 batches of bile samples were analyzed by UHPLC-QTOF-MSE to obtain retention time-exact mass (RTEM) data pairs, which were used to identify bile acid-like chemical components. Then, models were built with the support vector machine (SVM), random forest (RF), artificial neural network (ANN), gradient boosting (GB), AdaBoost (AB), and Naive Bayes (NB) algorithms and evaluated by accuracy (Acc), precision (P), and area under the curve (AUC). Finally, the bile medicines were digitally identified with the optimal model. The RF model built on the 12 identified bile acid-like chemical constituents performed best, with Acc, P, and AUC all > 0.950. In external validation, 42 batches of bile medicines analyzed at different times were identified with 100.0% accuracy. UHPLC-QTOF-MSE analysis combined with the RF algorithm can therefore identify bile medicines efficiently and accurately, providing a reference for their quality control. Hyodeoxycholic acid, glycohyodeoxycholic acid, and taurochenodeoxycholic acid were among the most important bile acid constituents for distinguishing the nine bile medicines.
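
As a rough illustration of two of the evaluation parameters named in the abstract, the sketch below computes accuracy (Acc) and precision (P) for a multi-class prediction in plain Python. The labels are invented for the example and are not data from the study. Macro-averaging of precision across classes is an assumption, since the abstract does not state how P was averaged, and the function names `accuracy` and `macro_precision` are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_precision(y_true, y_pred):
    """Unweighted mean over classes of TP / (TP + FP).

    Assumption: macro averaging; the abstract does not specify the scheme.
    A class that is never predicted contributes a precision of 0.0.
    """
    predicted = Counter(y_pred)  # TP + FP per class
    true_pos = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    classes = sorted(set(y_true) | set(y_pred))
    per_class = [
        true_pos[c] / predicted[c] if predicted[c] else 0.0 for c in classes
    ]
    return sum(per_class) / len(per_class)


# Invented labels for three of the nine bile powders (illustration only).
y_true = ["BBP", "BBP", "PBP", "PBP", "CBP", "CBP"]
y_pred = ["BBP", "BBP", "PBP", "CBP", "CBP", "CBP"]

print(f"Acc = {accuracy(y_true, y_pred):.3f}")
print(f"P   = {macro_precision(y_true, y_pred):.3f}")
```

In this toy case one PBP sample is misassigned to CBP, which lowers both accuracy and the precision of the CBP class. AUC is omitted because it requires class probability scores rather than hard labels.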