MURDA: Multisource Unsupervised Raman Spectroscopy Domain Adaptation Model with Reconstructed Target Domains for Medical Diagnosis Assistance

Anal Chem. 2024 Sep 20. doi: 10.1021/acs.analchem.4c01581. Online ahead of print.

Abstract

The use of artificial intelligence combined with Raman spectroscopy for disease diagnosis is on the rise. However, these methods require large amounts of annotated spectral data to achieve high diagnostic accuracy, and annotation consumes significant medical resources and time. To reduce dependence on labeled medical data, we propose a method called Multisource Unsupervised Raman Spectroscopy Domain Adaptation Model with Reconstructed Target Domains (MURDA). It transfers knowledge learned from source domain data sets of different diseases to an unlabeled target domain data set. Compared with transfer from a single source domain, drawing on multiple disease source domains provides more generalizable knowledge. Given the diversity of autoimmune diseases and their limited sample sizes, we apply MURDA to assist in the medical diagnosis of autoimmune diseases. Additionally, we propose a Double-Branch Multiscale Convolutional Self-Attention (DMCS) feature extractor that is better suited to extracting features from spectral data. On three serum Raman spectroscopy data sets for autoimmune diseases, MURDA's multisource domain adaptation achieved higher diagnostic accuracy than traditional single-source and multisource models, with accuracies of 73.6%, 83.4%, and 82.9%, respectively. Compared with source-only models without domain adaptation, accuracy improved by 15.1%, 36%, and 21.6%, respectively. These results demonstrate the effectiveness of Raman spectroscopy combined with MURDA in diagnosing autoimmune diseases. We also investigated the spectral peaks on which the model's decisions most depend in the transfer tasks, which may aid future research on artificial intelligence combined with Raman spectroscopy for diagnosing autoimmune diseases.
Furthermore, to validate the effectiveness and generalization performance of MURDA, we conducted experiments on the publicly available RRUFF data set, exploring the application of multisource unsupervised domain adaptation to other Raman spectroscopy scenarios.