Performance of Automated Machine Learning in Predicting Outcomes of Pneumatic Retinopexy

Arina Nisanova; Arefeh Yavary; Jordan Deaner; Ferhina S Ali; Priyanka Gogte; Richard Kaplan; Kevin C Chen; Eric Nudleman; Dilraj Grewal; Meenakashi Gupta; Jeremy Wolfe; Michael Klufas; Glenn Yiu; Iman Soltani; Parisa Emami-Naeini

doi:10.1016/j.xops.2024.100470

Ophthalmol Sci. 2024 Jan 19;4(5):100470. doi: 10.1016/j.xops.2024.100470. eCollection 2024 Sep-Oct.

Authors

Arina Nisanova¹, Arefeh Yavary², Jordan Deaner³, Ferhina S Ali⁴, Priyanka Gogte⁵, Richard Kaplan⁶, Kevin C Chen⁷, Eric Nudleman⁸, Dilraj Grewal⁹, Meenakashi Gupta⁶, Jeremy Wolfe⁵, Michael Klufas¹⁰, Glenn Yiu¹¹, Iman Soltani¹², Parisa Emami-Naeini¹¹

Affiliations

¹ School of Medicine, University of California Davis, Davis, California.
² Department of Computer Science, University of California Davis, Davis, California.
³ Mid Atlantic Retina, Wills Eye Hospital, Philadelphia, Pennsylvania.
⁴ New York Medical College, Valhalla, New York.
⁵ Associated Retinal Consultants, Royal Oak, Michigan.
⁶ New York Eye and Ear Infirmary of Mount Sinai, New York, New York.
⁷ Vantage Eye Center, Salinas, California.
⁸ Shiley Eye Center, University of California San Diego, La Jolla, California.
⁹ Eye Center, Duke University, Durham, North Carolina.
¹⁰ Wills Eye Hospital, Thomas Jefferson University, Philadelphia, Pennsylvania.
¹¹ Tschannen Eye Institute, University of California Davis, Sacramento, California.
¹² Department of Mechanical and Aerospace Engineering, University of California Davis, Davis, California.

Abstract

Purpose: Automated machine learning (AutoML) has emerged as a novel tool for medical professionals lacking coding experience, enabling them to develop predictive models for treatment outcomes. This study evaluated the performance of AutoML tools in developing models that predict the success of pneumatic retinopexy (PR) in the treatment of rhegmatogenous retinal detachment (RRD). These models were then compared with custom models created by machine learning (ML) experts.

Design: Retrospective multicenter study.

Participants: Five hundred thirty-nine consecutive patients with primary RRD who underwent PR performed by a vitreoretinal fellow at 6 training hospitals between 2002 and 2022.

Methods: We used 2 AutoML platforms: MATLAB Classification Learner and Google Cloud AutoML. Additional models were developed by computer scientists. Model inputs comprised patient demographics and baseline characteristics, including lens and macula status, RRD size, number and location of breaks, presence of vitreous hemorrhage and lattice degeneration, and physicians' experience. The dataset was split into a training set (n = 483) and a test set (n = 56). The training set, with a 2:1 success-to-failure ratio, was used to train the MATLAB models. Because Google Cloud AutoML requires a minimum of 1000 samples, the training set was tripled to create a new set with 1449 data points. Additionally, balanced datasets with a 1:1 success-to-failure ratio were created using Python.
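
The abstract does not say how the tripling or the class balancing was carried out. The following is only a minimal sketch of one plausible approach, assuming that tripling means repeating each training row 3 times and that balancing is done by random oversampling of the minority class. The function names, the `success` label key, and the toy 322:161 split are hypothetical and are not taken from the study.

```python
import random

def triple(rows):
    """Repeat every row three times (assumed meaning of 'tripled')."""
    return [row for row in rows for _ in range(3)]

def balance_by_oversampling(rows, label_key="success", seed=0):
    """Randomly duplicate minority-class rows until both classes are equal in size.

    Assumes binary labels (1 = success, 0 = failure) and that both classes occur.
    """
    rng = random.Random(seed)
    pos = [r for r in rows if r[label_key] == 1]
    neg = [r for r in rows if r[label_key] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    balanced = majority + minority + extra
    rng.shuffle(balanced)
    return balanced

# Toy stand-in for the training set: 483 rows with a 2:1 success-to-failure ratio.
train = [{"id": i, "success": 1 if i < 322 else 0} for i in range(483)]

tripled = triple(train)                    # 483 rows repeated 3 times
balanced = balance_by_oversampling(train)  # 1:1 success-to-failure ratio
```

Oversampling keeps every original record, whereas undersampling the majority class would discard data; which of these (or another method) the authors used is not stated here.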

Main outcome measures: Single-procedure anatomic success rate, as predicted by the ML models. F2 scores and area under the receiver operating characteristic curve (AUROC) were used as primary metrics to compare models.
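
For reference, the F2 score is the standard F-beta score with beta = 2, which weights recall (sensitivity) more heavily than precision. The abstract does not spell out the formula; the definition below is the conventional one, with P denoting precision and R denoting recall.

```latex
F_\beta = \frac{(1 + \beta^2)\, P \, R}{\beta^2 P + R},
\qquad
F_2 = \frac{5\, P \, R}{4 P + R}
```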

Results: When trained on the balanced datasets, the best-performing AutoML model (MATLAB; F2 score: 0.85; AUROC: 0.90) performed comparably to the custom model (F2 score: 0.92; AUROC: 0.86). However, training the AutoML model on imbalanced data yielded a misleadingly high AUROC (0.81) despite a low F2 score (0.2) and sensitivity (0.17).
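
A high AUROC can coexist with a low F2 score because AUROC measures only how well the model ranks cases, while F2 and sensitivity depend on a fixed decision threshold. The sketch below illustrates this with invented scores (synthetic values, not study data) and an assumed threshold of 0.5; the helper names are hypothetical.

```python
def auroc(pos_scores, neg_scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def threshold_metrics(pos_scores, neg_scores, threshold=0.5, beta=2.0):
    """Sensitivity, precision, and F-beta when scores >= threshold are called positive."""
    tp = sum(s >= threshold for s in pos_scores)
    fn = len(pos_scores) - tp
    fp = sum(s >= threshold for s in neg_scores)
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = beta**2 * precision + sensitivity
    f_beta = (1 + beta**2) * precision * sensitivity / denom if denom else 0.0
    return sensitivity, precision, f_beta

# Synthetic scores: positives mostly outrank negatives, but few exceed 0.5.
pos = [0.90, 0.45, 0.40, 0.38, 0.33, 0.28]
neg = [0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.25, 0.30, 0.35, 0.42]

area = auroc(pos, neg)
sensitivity, precision, f2 = threshold_metrics(pos, neg)
print(f"AUROC={area:.2f} sensitivity={sensitivity:.2f} F2={f2:.2f}")
```

In this toy case the ranking is good, so AUROC is high, yet only one of the six positives crosses the threshold, so sensitivity and F2 are low. Reporting a threshold-dependent metric alongside AUROC exposes that gap.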

Conclusions: We demonstrated the feasibility of using AutoML as an accessible tool for medical professionals to develop models from clinical data. Such models can ultimately aid clinical decision-making and contribute to better patient outcomes. However, results can be misleading or unreliable if the tools are used naively. Limitations exist, particularly when datasets contain missing variables or are highly imbalanced. Proper model selection and data preprocessing can improve the reliability of AutoML tools.

Financial disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.

Keywords: Automated machine learning (AutoML); Machine learning; Medical outcome prediction; Pneumatic retinopexy; Rhegmatogenous retinal detachment.