Discriminative, restorative, and adversarial learning have each proven beneficial for self-supervised learning in computer vision and medical imaging. Existing efforts, however, fail to capitalize on the synergistic effects these methods may offer in a ternary setup, which, we envision, can significantly benefit deep semantic representation learning. To this end, we developed DiRA, the first framework that unites discriminative, restorative, and adversarial learning to collaboratively glean complementary visual information from unlabeled medical images for fine-grained semantic representation learning. Our extensive experiments demonstrate that DiRA: (1) encourages collaborative learning among the three learning ingredients, resulting in more generalizable representations across organs, diseases, and modalities; (2) outperforms fully supervised ImageNet models and increases robustness in small data regimes, reducing annotation cost across multiple medical imaging applications; (3) learns fine-grained semantic representations, facilitating accurate lesion localization with only image-level annotation; (4) improves reusability of low/mid-level features; and (5) enhances restorative self-supervised approaches, revealing that DiRA is a general framework for united representation learning. Code and pretrained models are available at https://github.com/JLiangLab/DiRA.
Keywords: Fine-grained representation learning; Self-supervised learning; Transfer learning.
Copyright © 2024 Elsevier B.V. All rights reserved.