Aim: To develop and evaluate a classification approach that identifies rural adults at high risk of type 2 diabetes mellitus (T2DM) without using biochemical parameters.
Methods: A cross-sectional survey was conducted. Of the 8640 subjects who met the inclusion criteria, 75% (N1=6480) were randomly selected to form the training set used to construct artificial neural network (ANN) and multivariate logistic regression (MLR) models. The remaining 25% (N2=2160) were assigned to the validation set used to compare the performance of the ANN and MLR models. The predictive performance of the models was analyzed on the validation set using receiver operating characteristic (ROC) curves.
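The split and the ROC-based comparison can be illustrated with a minimal sketch. It is not the study's code: the function names, the random seed, and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation rather than any particular statistics package.

```python
import random


def split_indices(n_subjects, train_fraction=0.75, seed=0):
    """Randomly partition subject indices into training and validation sets."""
    indices = list(range(n_subjects))
    random.Random(seed).shuffle(indices)  # seed value is an arbitrary assumption
    n_train = round(n_subjects * train_fraction)
    return indices[:n_train], indices[n_train:]


def roc_auc(labels, scores):
    """AUC as the probability that a random case scores higher than a
    random non-case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 8640 eligible subjects split 75% / 25%, as in the Methods.
train_idx, valid_idx = split_indices(8640)
print(len(train_idx), len(valid_idx))  # 6480 2160

# Hypothetical labels (1 = T2DM) and model scores, for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

In practice each model's predicted probabilities on the validation set would take the place of `toy_scores`, giving one AUC per model.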
Results: The prevalence of T2DM was 8.66% (n=561) in the training set and 9.21% (n=199) in the validation set. For the ANN model, the sensitivity, specificity, positive predictive value, and negative predictive value for identifying T2DM were 86.93%, 79.14%, 31.86%, and 98.18%, respectively, whereas the corresponding values for the MLR model were only 60.80%, 75.48%, 21.78%, and 94.52%. The area under the ROC curve (AUC) for identifying T2DM was 0.891 for the ANN model, indicating more accurate prediction than the MLR model (AUC=0.744; P=0.0001).
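For readers less familiar with these four measures, the sketch below shows how they follow from a 2x2 confusion matrix. The counts are hypothetical (1000 subjects at 10% prevalence) and are not the study's data; they are chosen only to show why a low-prevalence condition can combine high sensitivity and specificity with a modest positive predictive value and a very high negative predictive value.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard screening measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # cases correctly flagged
        "specificity": tn / (tn + fp),  # non-cases correctly cleared
        "ppv": tp / (tp + fp),          # flagged subjects who are cases
        "npv": tn / (tn + fn),          # cleared subjects who are non-cases
    }


# Hypothetical counts, for illustration only: 100 cases, 900 non-cases.
metrics = diagnostic_metrics(tp=90, fp=180, tn=720, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
# sensitivity: 90.00%
# specificity: 80.00%
# ppv: 33.33%
# npv: 98.63%
```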
Conclusion: The ANN model is an effective classification approach for identifying those at high risk of T2DM based on demographic, lifestyle and anthropometric data.
Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.