Applying contrastive pre-training for depression and anxiety risk prediction in type 2 diabetes patients based on heterogeneous electronic health records: a primary healthcare case study

J Am Med Inform Assoc. 2024 Jan 18;31(2):445-455. doi: 10.1093/jamia/ocad228.

Abstract

Objective: Because medical data in primary healthcare services (PHS) are heterogeneous and limited, assessing the psychological risk of type 2 diabetes mellitus (T2DM) patients in PHS is difficult. Using unsupervised contrastive pre-training, we proposed a deep learning framework named depression and anxiety prediction (DAP) to predict depression and anxiety in T2DM patients.

Materials and methods: The DAP model consists of two sub-models. Firstly, the pre-trained model of DAP used unlabeled discharge records of 85 085 T2DM patients from the First Affiliated Hospital of Nanjing Medical University for unsupervised contrastive learning on heterogeneous electronic health records (EHRs). Secondly, the fine-tuned model of DAP used case-control cohorts (17 491 patients) selected from 149 596 T2DM patients' EHRs in the Nanjing Health Information Platform (NHIP). The DAP model was validated in 1028 patients from PHS in NHIP. Evaluation included the area under the receiver operating characteristic curve (ROC-AUC), the area under the precision-recall curve (PR-AUC), and decision curve analysis (DCA).
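
The abstract does not state which contrastive objective was used for pre-training. As an illustration only, the following is a minimal sketch assuming an InfoNCE-style loss, a common choice for unsupervised contrastive learning. The function names, the temperature value, and the toy two-dimensional embeddings are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.1):
    """Mean InfoNCE loss over a batch (assumed objective, not the paper's).

    view_a[i] and view_b[i] are embeddings of two views of the same record
    (the positive pair); every other row of view_b acts as a negative.
    """
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the positive pair as the target class.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: in `aligned`, matching rows point the same way;
# in `shuffled`, the positives are swapped, so the loss should be larger.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(anchors, [[0.9, 0.1], [0.1, 0.9]])
shuffled = info_nce(anchors, [[0.1, 0.9], [0.9, 0.1]])
```

Minimizing a loss of this kind pulls the two views of the same record together and pushes other records apart, which is one way unlabeled discharge records can shape an encoder before fine-tuning.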

Results: The pre-training step allowed the DAP model to converge at a faster rate. The fine-tuned DAP model significantly outperformed the baseline models (logistic regression, extreme gradient boosting, and random forest) with ROC-AUC of 0.91 ± 0.028 and PR-AUC of 0.80 ± 0.067 in 10-fold internal validation, and with ROC-AUC of 0.75 ± 0.045 and PR-AUC of 0.47 ± 0.081 in external validation. The DCA indicated the clinical potential of the DAP model.
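
For readers unfamiliar with DCA, the quantity it plots is the net benefit at a chosen threshold probability: true positives per patient minus false positives per patient, weighted by the odds of the threshold. The sketch below uses the standard net benefit formula. The labels and predicted probabilities are invented toy values, not data from this study, and the function names are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on predictions at a given threshold probability.

    NB = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that flags every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy example (invented values): 8 patients, 3 with the outcome.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.2, 0.1, 0.3, 0.6, 0.05]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
```

A decision curve repeats this calculation over a range of thresholds and compares the model against the "flag everyone" and "flag no one" (net benefit 0) strategies; a model is useful at thresholds where its curve lies above both.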

Conclusion: The DAP model effectively predicted post-discharge depression and anxiety in T2DM patients from PHS, mitigating the impact of fragmented and limited data. This study highlights the DAP model's potential for early detection and intervention in depression and anxiety, improving outcomes for diabetes patients.

Keywords: EHR pre-trained model; deep learning; depression and anxiety; regional EHRs; type 2 diabetes mellitus.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Aftercare
  • Anxiety
  • Depression
  • Diabetes Mellitus, Type 2* / complications
  • Diabetes Mellitus, Type 2* / diagnosis
  • Electronic Health Records
  • Humans
  • Machine Learning
  • Patient Discharge