Weakly Supervised Deep Learning for Whole Slide Lung Cancer Image Analysis

IEEE Trans Cybern. 2020 Sep;50(9):3950-3962. doi: 10.1109/TCYB.2019.2935141. Epub 2019 Sep 2.

Abstract

Histopathology image analysis serves as the gold standard for cancer diagnosis. Efficient and precise diagnosis is quite critical for the subsequent therapeutic treatment of patients. So far, computer-aided diagnosis has not been widely applied in pathological field yet as currently well-addressed tasks are only the tip of the iceberg. Whole slide image (WSI) classification is a quite challenging problem. First, the scarcity of annotations heavily impedes the pace of developing effective approaches. Pixelwise delineated annotations on WSIs are time consuming and tedious, which poses difficulties in building a large-scale training dataset. In addition, a variety of heterogeneous patterns of tumor existing in high magnification field are actually the major obstacle. Furthermore, a gigapixel scale WSI cannot be directly analyzed due to the immeasurable computational cost. How to design the weakly supervised learning methods to maximize the use of available WSI-level labels that can be readily obtained in clinical practice is quite appealing. To overcome these challenges, we present a weakly supervised approach in this article for fast and effective classification on the whole slide lung cancer images. Our method first takes advantage of a patch-based fully convolutional network (FCN) to retrieve discriminative blocks and provides representative deep features with high efficiency. Then, different context-aware block selection and feature aggregation strategies are explored to generate globally holistic WSI descriptor which is ultimately fed into a random forest (RF) classifier for the image-level prediction. To the best of our knowledge, this is the first study to exploit the potential of image-level labels along with some coarse annotations for weakly supervised learning. A large-scale lung cancer WSI dataset is constructed in this article for evaluation, which validates the effectiveness and feasibility of the proposed method. Extensive experiments demonstrate the superior performance of our method that surpasses the state-of-the-art approaches by a significant margin with an accuracy of 97.3%. In addition, our method also achieves the best performance on the public lung cancer WSIs dataset from The Cancer Genome Atlas (TCGA). We highlight that a small number of coarse annotations can contribute to further accuracy improvement. We believe that weakly supervised learning methods have great potential to assist pathologists in histology image diagnosis in the near future.

MeSH terms

  • Deep Learning*
  • Histocytochemistry
  • Humans
  • Image Interpretation, Computer-Assisted / methods*
  • Lung Neoplasms* / chemistry
  • Lung Neoplasms* / diagnostic imaging
  • Lung Neoplasms* / pathology
  • Supervised Machine Learning*