Background: Adolescents often experience difficulties with sleep quality. The existing literature on predicting severe sleep disturbance is limited, primarily due to the absence of reliable tools.
Methods: This study analyzed 1966 university students. Participants were randomly assigned to a training set and a validation set at a ratio of 8:2. Models were developed in the training set using logistic regression (LR) and five machine learning algorithms: eXtreme Gradient Boosting Machine (XGBM), Naïve Bayesian (NB), Support Vector Machine (SVM), Decision Tree (DT), and CatBoosting Machine (CatBM). The developed models were then evaluated in the validation set.
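As an illustration only, the random 8:2 assignment described in the Methods could be sketched as follows. The function name, the seed, and the integer participant IDs are hypothetical, and the sketch assumes a simple unstratified shuffle; the abstract does not state whether the split was stratified by outcome.

```python
import random


def split_train_validation(participant_ids, train_fraction=0.8, seed=42):
    """Randomly partition participant IDs into training and validation lists.

    Assumption: a plain shuffle with a fixed seed, not a stratified split.
    """
    shuffled = list(participant_ids)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical IDs 0..1965 stand in for the 1966 analyzed students.
train_ids, validation_ids = split_train_validation(range(1966))
print(len(train_ids), len(validation_ids))
```

With an outcome as rare as the one reported here, a stratified split may be preferable so that both sets contain a similar proportion of cases.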
Results: The incidence of severe sleep disturbance was 5.28% (104/1969). Among all developed models, the XGBM model achieved the highest AUC (0.872 [95% CI: 0.848-0.896]), followed by the CatBM model (0.853 [95% CI: 0.821-0.878]) and the DT model (0.843 [95% CI: 0.801-0.870]), whereas the AUC of the LR model was only 0.822 (95% CI: 0.777-0.856). Additionally, the XGBM model had the best accuracy (0.792), precision (0.780), F1 score (0.796), Brier score (0.143), and log loss (0.444).
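For readers less familiar with the reported metrics, the sketch below shows their standard definitions in plain Python. The labels and predicted probabilities are toy values invented for illustration, not study data, and the 0.5 classification threshold is an assumption; the abstract does not state the threshold used.

```python
import math


def auc(y_true, y_prob):
    """AUC as the probability that a random case is ranked above a random non-case."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, and F1 after dichotomizing at an assumed threshold."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 1)
    tn = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 0)
    fp = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(y_true), "precision": precision, "f1": f1}


# Toy example (hypothetical values, not study data).
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, probs), brier_score(labels, probs), log_loss(labels, probs))
print(threshold_metrics(labels, probs))
```

Note that higher values are better for AUC, accuracy, precision, and F1, whereas lower values are better for the Brier score and log loss.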
Conclusions: The XGBM model may be a useful tool to estimate the risk of experiencing severe sleep disturbance among adolescents.
Keywords: Pittsburgh sleep quality index; adolescents; epidemiology; machine learning; prediction model; sleep disturbance.
Copyright © 2024 Zhang, Zhao, Yang, Yang, Wu, Zheng and Lei.