Estimation of postpartum depression risk from electronic health records using machine learning

Guy Amit; Irena Girshovitz; Karni Marcus; Yiye Zhang; Jyotishman Pathak; Vered Bar; Pinchas Akiva

doi:10.1186/s12884-021-04087-8

BMC Pregnancy Childbirth. 2021 Sep 17;21(1):630. doi: 10.1186/s12884-021-04087-8.

Authors

Guy Amit¹, Irena Girshovitz², Karni Marcus², Yiye Zhang³, Jyotishman Pathak³, Vered Bar⁴, Pinchas Akiva²

Affiliations

¹ KI Research Institute, Kfar Malal, Israel. guy@kinstitute.org.il.
² KI Research Institute, Kfar Malal, Israel.
³ Division of Health Informatics, Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA.
⁴ Women's Mental Health, Sheba Medical Center, Ramat Gan, Israel.

Abstract

Background: Postpartum depression (PPD) is a widespread disorder that adversely affects the well-being of mothers and their newborns. We aim to use machine learning to predict the risk of PPD from primary care electronic health records (EHR) data, and to evaluate the potential value of EHR-based prediction in improving the accuracy of PPD screening and in the early identification of women at risk.

Methods: We analyzed EHR data of 266,544 women from the UK who had their first live birth between 2000 and 2017. We extracted a multitude of socio-demographic and medical variables and constructed a machine learning model that predicts the risk of PPD during the year following childbirth. We evaluated the model's performance using multiple validation methodologies and measured its accuracy both as a stand-alone tool and as an adjunct to standard questionnaire-based screening with the Edinburgh Postnatal Depression Scale (EPDS).

Results: The prevalence of PPD in the analyzed cohort was 13.4%. Combining EHR-based prediction with the EPDS score increased the area under the receiver operating characteristic curve (AUC) from 0.805 to 0.844 and the sensitivity from 0.72 to 0.76, at a specificity of 0.80. The AUC of the EHR-based prediction model alone ranged from 0.72 to 0.74 and decreased by only 0.01-0.02 when the model was applied as early as before the beginning of pregnancy.
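
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an AUC and a sensitivity at a fixed specificity of 0.80 can be computed from risk scores. It is not the authors' evaluation code: the function names `auc` and `sensitivity_at_specificity` are hypothetical, and the labels and scores are invented toy values, not data or results from the study.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_at_specificity(labels, scores, target_spec=0.80):
    """Highest sensitivity over all thresholds whose specificity is at
    least target_spec. A case is flagged when its score >= threshold."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    best = 0.0  # flagging nobody gives specificity 1.0 and sensitivity 0.0
    for t in sorted(set(scores)):
        spec = sum(n < t for n in neg) / len(neg)
        sens = sum(p >= t for p in pos) / len(pos)
        if spec >= target_spec:
            best = max(best, sens)
    return best


# Toy example (invented values): 1 = PPD diagnosed, 0 = no PPD.
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.7, 0.35, 0.6, 0.8, 0.9, 0.5]

print(auc(labels, scores))
print(sensitivity_at_specificity(labels, scores, target_spec=0.80))
```

Fixing the specificity, as the abstract does, lets two screening approaches be compared at the same false-positive rate, so a difference in sensitivity reflects additional true cases identified rather than a shifted threshold.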

Conclusions: PPD risk prediction using EHR data may provide a complementary quantitative and objective tool for PPD screening, allowing earlier (pre-pregnancy) and more accurate identification of women at risk, timely interventions and potentially improved outcomes for the mother and child.

Keywords: Electronic health records; Machine learning; Postpartum depression.

MeSH terms

Adolescent
Adult
Area Under Curve
Cohort Studies
Depression, Postpartum / epidemiology*
Electronic Health Records
Female
Humans
Machine Learning
Middle Aged
Pregnancy
Risk Assessment / methods*
Risk Factors
United Kingdom / epidemiology
Young Adult
