A comprehensive study on machine learning models combining with oversampling for bronchopulmonary dysplasia-associated pulmonary hypertension in very preterm infants

Respir Res. 2024 May 8;25(1):199. doi: 10.1186/s12931-024-02797-z.

Abstract

Background: Bronchopulmonary dysplasia-associated pulmonary hypertension (BPD-PH) remains a devastating clinical complication that seriously affects the therapeutic outcomes of preterm infants. Early prevention and timely diagnosis before pathological changes occur are therefore key to reducing morbidity and improving prognosis. Our primary objective was to use machine learning techniques to build predictive models that accurately identify infants with BPD who are at risk of developing PH.

Methods: The data used in this study were collected from the neonatology departments of four tertiary-level hospitals in China. To address class imbalance in the data, the synthetic minority over-sampling technique (SMOTE), an oversampling algorithm, was applied to improve model performance.
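
As an illustration of the oversampling step, the following is a minimal sketch of the core SMOTE idea: each synthetic minority-class record is placed at a random position on the line segment between a minority record and one of its nearest minority neighbours. The abstract does not describe the authors' implementation, so the function name `smote`, the parameters `n_new`, `k`, and `seed`, and the toy data are all assumptions made for this example. The sketch handles numeric features only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority-class neighbours.

    Hypothetical helper for illustration; not the study's implementation.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base row (excluding the row itself)
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # random position along the segment base -> mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Toy, made-up data: 4 minority-class rows with 2 numeric features each.
minority_rows = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_rows = smote(minority_rows, n_new=6, k=2)
```

Because every synthetic row is an interpolation between two real minority rows, it stays inside the range spanned by the observed minority data. Oversampling of this kind is normally applied to the training split only, so that evaluation data contain no synthetic records.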

Results: A total of 761 clinical records were collected. Following data pre-processing and feature selection, 5 of the 46 features were used to build the models: duration of invasive respiratory support (days), severity of BPD, ventilator-associated pneumonia, pulmonary hemorrhage, and early-onset PH. Four machine learning models were trained, and one was ultimately selected after comprehensive comparison. The selected model achieved 93.8% sensitivity, 85.0% accuracy, and an AUC of 0.933. A logistic regression formula score greater than 0 was identified as a warning sign of BPD-PH.
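
To make the score threshold and the reported metrics concrete, the sketch below shows how a logistic regression score is computed from the five features and why a cutoff of 0 on the score corresponds to a predicted probability of 0.5. The abstract does not report the fitted coefficients, so the weights, the intercept, the feature encoding, and the example records here are hypothetical placeholders, not values from the study.

```python
import math

# HYPOTHETICAL coefficients for illustration only; the abstract does not report them.
# Feature order: invasive respiratory support (days), BPD severity (assumed 1-3),
# ventilator-associated pneumonia (0/1), pulmonary hemorrhage (0/1), early-onset PH (0/1).
WEIGHTS = [0.05, 0.8, 1.0, 1.2, 1.5]
INTERCEPT = -4.0


def linear_score(features):
    """Logistic regression score (log-odds) before the sigmoid is applied."""
    return INTERCEPT + sum(w * x for w, x in zip(WEIGHTS, features))


def probability(score):
    """Sigmoid of the score; score > 0 is equivalent to probability > 0.5."""
    return 1.0 / (1.0 + math.exp(-score))


def is_warning(features):
    """Flag a record when its score exceeds the cutoff of 0."""
    return linear_score(features) > 0


def sensitivity_and_accuracy(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); accuracy = (TP + TN) / total."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp / (tp + fn), (tp + tn) / len(y_true)


# Two made-up records: one with many risk factors, one with few.
high_risk = [30, 3, 1, 0, 1]
low_risk = [5, 1, 0, 0, 0]
flags = [is_warning(high_risk), is_warning(low_risk)]

# Made-up labels and predictions, used only to show the metric definitions.
sens, acc = sensitivity_and_accuracy([1, 1, 1, 0, 0, 0, 0, 1],
                                     [1, 1, 0, 0, 0, 1, 0, 1])
```

Sensitivity counts only the true BPD-PH cases, which is why it can differ substantially from accuracy on imbalanced data such as those described here.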

Conclusions: We comprehensively compared different machine learning models and obtained a well-performing prognostic model that is sufficient to help pediatric clinicians make an early diagnosis and formulate a better treatment plan for patients with BPD-PH.

Keywords: Bronchopulmonary dysplasia; Machine learning; Oversampling; Prediction model; Pulmonary hypertension.

Publication types

  • Multicenter Study

MeSH terms

  • Bronchopulmonary Dysplasia* / diagnosis
  • Female
  • Humans
  • Hypertension, Pulmonary* / diagnosis
  • Infant, Extremely Premature
  • Infant, Newborn
  • Infant, Premature
  • Machine Learning*
  • Male
  • Retrospective Studies