Development of Debiasing Technique for Lung Nodule Chest X-ray Datasets to Generalize Deep Learning Models

Michael J Horry; Subrata Chakraborty; Biswajeet Pradhan; Manoranjan Paul; Jing Zhu; Hui Wen Loh; Prabal Datta Barua; U Rajendra Acharya

doi:10.3390/s23146585

Sensors (Basel). 2023 Jul 21;23(14):6585. doi: 10.3390/s23146585.

Authors

Michael J Horry¹ ², Subrata Chakraborty¹ ³, Biswajeet Pradhan¹ ⁴, Manoranjan Paul⁵, Jing Zhu⁶, Hui Wen Loh⁷, Prabal Datta Barua¹ ³ ⁸ ⁹, U Rajendra Acharya¹⁰

Affiliations

¹ Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia.
² IBM Australia Limited, Sydney, NSW 2000, Australia.
³ Faculty of Science, Agriculture, Business and Law, University of New England, Armidale, NSW 2351, Australia.
⁴ Earth Observation Center, Institute of Climate Change, Universiti Kebangsaan Malaysia, Bangi 43600, Malaysia.
⁵ Machine Vision and Digital Health (MaViDH), School of Computing and Mathematics, Charles Sturt University, Bathurst, NSW 2795, Australia.
⁶ Department of Radiology, Westmead Hospital, Westmead, NSW 2145, Australia.
⁷ School of Science and Technology, Singapore University of Social Sciences, Singapore 599494, Singapore.
⁸ Cogninet Brain Team, Cogninet Australia, Sydney, NSW 2010, Australia.
⁹ School of Business (Information Systems), Faculty of Business, Education, Law & Arts, University of Southern Queensland, Toowoomba, QLD 4350, Australia.
¹⁰ School of Mathematics, Physics and Computing, University of Southern Queensland, Springfield, QLD 4300, Australia.

Abstract

Screening programs for early lung cancer diagnosis are uncommon, primarily due to the challenge of reaching at-risk patients located in rural areas far from medical facilities. To overcome this obstacle, a comprehensive approach is needed that combines mobility, low cost, speed, accuracy, and privacy. One potential solution lies in combining the chest X-ray imaging mode with federated deep learning, ensuring that no single data source can bias the model adversely. This study presents a pre-processing pipeline designed to debias chest X-ray images, thereby enhancing internal classification and external generalization. The pipeline employs a pruning mechanism to train a deep learning model for nodule detection, utilizing the most informative images from a publicly available lung nodule X-ray dataset. Histogram equalization is used to remove systematic differences in image brightness and contrast. Model training is then performed using combinations of lung field segmentation, close cropping, and rib/bone suppression. The resulting deep learning models, generated through this pre-processing pipeline, demonstrate successful generalization on an independent lung nodule dataset. By eliminating confounding variables in chest X-ray images and suppressing signal noise from the bone structures, the proposed deep learning lung nodule detection algorithm achieves an external generalization accuracy of 89%. This approach paves the way for the development of a low-cost and accessible deep learning-based clinical system for lung cancer screening.
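
As an illustration of two of the pre-processing steps named in the abstract, the following is a minimal sketch of histogram equalization and close cropping to a lung field mask. It is not the authors' implementation. The abstract does not say which equalization variant was used, so plain global equalization on 8-bit grayscale images is assumed here. The function names `equalize_histogram` and `crop_to_mask` are hypothetical, and the lung mask is assumed to come from a separate segmentation step.

```python
import numpy as np


def equalize_histogram(image: np.ndarray) -> np.ndarray:
    """Global histogram equalization for an 8-bit grayscale image.

    Assumption: global equalization; the source does not specify the variant.
    """
    if image.dtype != np.uint8:
        raise ValueError("expected an 8-bit (uint8) grayscale image")
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # CDF value at the darkest occupied gray level
    total = image.size
    if total == cdf_min:
        # Constant image: there is no contrast to redistribute.
        return image.copy()
    # Map each gray level so the output levels span the full 0..255 range.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]


def crop_to_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Crop an image to the tight bounding box of a binary lung field mask.

    Assumption: `mask` is produced by a separate segmentation model.
    """
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    if not rows.any():
        raise ValueError("mask is empty; nothing to crop to")
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    return image[r0:r1 + 1, c0:c1 + 1]
```

Because the mapping depends only on the rank order of gray levels, two copies of the same image that differ by a monotonic brightness or contrast shift equalize to the same output. That is the sense in which equalization can remove systematic, source-specific intensity differences between datasets.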

Keywords: chest X-ray; confounding bias; deep learning; federated learning; lung cancer; model generalization.

MeSH terms

Deep Learning*
Early Detection of Cancer
Humans
Lung
Lung Neoplasms* / diagnostic imaging
Neural Networks, Computer
X-Rays

Grants and funding

This research received no external funding.