Predicting lncRNA-Disease Associations Based on a Dual-Path Feature Extraction Network with Multiple Sources of Information Integration

Dengju Yao; Binbin Zhang; Xiaojuan Zhan; Bo Zhang; Xiang Kui Li

doi:10.1021/acsomega.4c05365

Predicting lncRNA-Disease Associations Based on a Dual-Path Feature Extraction Network with Multiple Sources of Information Integration

ACS Omega. 2024 Jul 30;9(32):35100-35112. doi: 10.1021/acsomega.4c05365. eCollection 2024 Aug 13.

Authors

Dengju Yao¹, Binbin Zhang¹, Xiaojuan Zhan^{1

2}, Bo Zhang¹, Xiang Kui Li¹

Affiliations

¹ School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China.
² College of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin 150050, China.

Abstract

Identifying the associations between long noncoding RNAs (lncRNAs) and disease is critical for disease prevention, diagnosis and treatment. However, conducting wet experiments to discover these associations is time-consuming and costly. Therefore, computational modeling for predicting lncRNA-disease associations (LDAs) has become an important alternative. To enhance the accuracy of LDAs prediction and alleviate the issue of node feature oversmoothing when exploring the potential features of nodes using graph neural networks, we introduce DPFELDA, a dual-path feature extraction network that leverages the integration of information from multiple sources to predict LDA. Initially, we establish a dual-view structure of lncRNAs and disease and a heterogeneous network of lncRNA-disease-microRNA (miRNA) interactions. Subsequently, features are extracted using a dual-path feature extraction network. In particular, we employ a combination of a graph convolutional network, a convolutional block attention module, and a node aggregation layer to perform multilayer topology feature extraction for the dual-view structure of lncRNAs and diseases. Additionally, we utilize a Transformer model to construct the node topology feature residual network for obtaining node-specific features in heterogeneous networks. Finally, XGBoost is employed for LDA prediction. The experimental results demonstrate that DPFELDA outperforms the benchmark model on various benchmark data sets. In the course of model exploration, it becomes evident that DPFELDA successfully alleviates the issue of node feature oversmoothing induced by graph-based learning. Ablation experiments confirm the effectiveness of the innovative module, and a case study substantiates the accuracy of DPFELDA model in predicting novel LDAs for characteristic diseases.