Strategies to Improve Convolutional Neural Network Generalizability and Reference Standards for Glaucoma Detection From OCT Scans

Kaveri A Thakoor; Xinhui Li; Emmanouil Tsamis; Zane Z Zemborain; Carlos Gustavo De Moraes; Paul Sajda; Donald C Hood

doi:10.1167/tvst.10.4.16

Transl Vis Sci Technol. 2021 Apr 1;10(4):16. doi: 10.1167/tvst.10.4.16.

Authors

Kaveri A Thakoor¹, Xinhui Li², Emmanouil Tsamis², Zane Z Zemborain², Carlos Gustavo De Moraes³, Paul Sajda¹ ⁴ ⁵, Donald C Hood² ³

Affiliations

¹ Department of Biomedical Engineering, Columbia University, New York, NY, USA.
² Department of Psychology, Columbia University, New York, NY, USA.
³ Department of Ophthalmology, Columbia University, New York, NY, USA.
⁴ Department of Electrical Engineering, Columbia University, New York, NY, USA.
⁵ Department of Radiology (Physics), Columbia University, New York, NY, USA.

Abstract

Purpose: To develop and evaluate methods to improve the generalizability of convolutional neural networks (CNNs) trained to detect glaucoma from optical coherence tomography retinal nerve fiber layer probability maps, as well as optical coherence tomography circumpapillary disc (circle) b-scans, and to explore the impact of the reference standard (RS) on CNN accuracy.

Methods: CNNs previously optimized for glaucoma detection from retinal nerve fiber layer probability maps, and newly developed CNNs adapted for glaucoma detection from optical coherence tomography b-scans, were evaluated on an unseen dataset (i.e., data collected at a different site). Multiple techniques were used to enhance CNN generalizability, including augmenting the training dataset, using multimodal input, and training with confidently rated images. Model performance was evaluated with different RS.
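Two of the strategies named above, training only on confidently rated images and augmenting the training set, can be illustrated with a minimal sketch. The abstract does not specify the rating scale, the confidence cutoffs, or the augmentations used, so everything here is an assumption for illustration: the 0 to 100 rating scale, the `low`/`high` thresholds, the horizontal-flip augmentation, and the helper names `select_confident` and `augment_with_flips` are all hypothetical, and the arrays are random placeholders rather than OCT data.

```python
import numpy as np

def select_confident(images, labels, ratings, low=25, high=75):
    """Keep only images rated far from the decision boundary.

    `ratings` is a hypothetical 0-100 expert glaucoma-likelihood score;
    the cutoffs `low` and `high` are illustrative, not from the paper.
    """
    ratings = np.asarray(ratings)
    mask = (ratings <= low) | (ratings >= high)
    return images[mask], labels[mask]

def augment_with_flips(images, labels):
    """Double the training set by appending horizontally flipped copies.

    Horizontal flipping is an assumed augmentation chosen for simplicity.
    """
    flipped = images[:, :, ::-1]  # reverse the width axis of each image
    return np.concatenate([images, flipped]), np.concatenate([labels, labels])

# Placeholder data: six random 4x4 "images" with hypothetical ratings.
rng = np.random.default_rng(0)
images = rng.random((6, 4, 4))
ratings = np.array([5, 40, 90, 60, 10, 80])
labels = (ratings >= 50).astype(int)

conf_images, conf_labels = select_confident(images, labels, ratings)
train_images, train_labels = augment_with_flips(conf_images, conf_labels)
```

In this sketch, filtering happens before augmentation so that flipped copies are created only for the confidently rated images.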

Results: Training with data augmentation and training on confident images enhanced the accuracy of the CNNs for glaucoma detection on a new dataset by 5% to 9%. CNN performance was optimal when a similar RS was used to establish labels both for the training and the testing sets. Nevertheless, the CNNs described here were robust to variation in the RS.

Conclusions: CNN generalizability can be improved with data augmentation, multiple input image modalities, and training on images with confident ratings. CNNs trained and tested with the same RS achieved best accuracy, suggesting that choosing a thorough and consistent RS for training and testing improves generalization to new datasets.

Translational relevance: Strategies for enhancing CNN generalizability and for choosing optimal RS should be standard practice for CNNs before their deployment for glaucoma detection.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Deep Learning*
Glaucoma* / diagnosis
Humans
Neural Networks, Computer
Reference Standards
Tomography, Optical Coherence
