SGTCDA: Prediction of circRNA-drug sensitivity associations with interpretable graph transformers and effective assessment

BMC Genomics. 2024 Nov 20;25(1):1113. doi: 10.1186/s12864-024-11022-6.

Abstract

CircRNAs are a type of circular non-coding RNA whose associations with drug sensitivities have been demonstrated in recent studies. Because biomedical experiments for detecting associations between circRNAs and drug sensitivities are costly, several computational methods have been developed. However, these methods were evaluated mainly by 5- or 10-fold cross-validation, which often yields over-optimistic performance estimates. Furthermore, these models suffer from technical issues such as over-smoothing and over-squashing. To address these issues, we propose a strategy for evaluating models on independent test sets in association prediction studies. In light of this assessment strategy, we constructed a model, SGTCDA, that integrates structural deep network embedding (SDNE) and a graph transformer to predict potential circRNA-drug sensitivity associations. The model can efficiently capture both long-range dependencies and local structural information of nodes. Our results on the training sets and the independent test sets indicate that SGTCDA outperforms other state-of-the-art models, demonstrating its capacity for accurate prediction of circRNA-drug sensitivity associations. Moreover, we used EdgeSHAPer to interpret the proposed SGTCDA model; the analysis shows that edges between drugs contribute more to the model's performance than other edges. The source code and dataset of SGTCDA are available at https://github.com/hwxia/SGTCDA.
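The idea of evaluating on an independent test set, rather than only by cross-validation, can be illustrated with a minimal sketch. The abstract does not specify how the authors built their splits, so everything below is an assumption for illustration: the function name `split_associations`, the `test_fraction` parameter, the balanced sampling of unknown pairs as negatives, and the toy association matrix are all hypothetical and are not taken from the SGTCDA repository.

```python
import numpy as np


def split_associations(adj, test_fraction=0.1, seed=0):
    """Hold out part of a binary circRNA x drug association matrix.

    Hypothetical helper (not the authors' procedure): a fraction of the
    known associations, plus an equal number of unknown pairs treated as
    negatives, is set aside as an independent test set that is never
    used for training or model selection.
    """
    rng = np.random.default_rng(seed)
    pos = np.argwhere(adj == 1)  # known associations as (row, col) pairs
    neg = np.argwhere(adj == 0)  # unknown pairs, assumed negative here
    rng.shuffle(pos)             # shuffles the pairs along the first axis
    rng.shuffle(neg)

    n_test = max(1, int(round(test_fraction * len(pos))))
    test_pos, train_pos = pos[:n_test], pos[n_test:]
    test_neg = neg[:n_test]
    train_neg = neg[n_test:n_test + len(train_pos)]

    # Mask held-out positives so they cannot leak into graph construction.
    train_adj = adj.copy()
    train_adj[test_pos[:, 0], test_pos[:, 1]] = 0
    return train_adj, (train_pos, train_neg), (test_pos, test_neg)


# Toy matrix: 6 circRNAs x 5 drugs with 10 known associations.
toy_adj = (np.arange(30).reshape(6, 5) % 3 == 0).astype(int)
train_adj, train_pairs, test_pairs = split_associations(toy_adj, test_fraction=0.2)
```

In this sketch, any cross-validation would be run only on `train_pairs` and `train_adj`, while `test_pairs` would be scored once at the end, which is the kind of separation the proposed assessment relies on.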

Keywords: CircRNA-drug sensitivity association; EdgeSHAPer; Graph transformer; Independent test sets; SDNE.

MeSH terms

  • Algorithms
  • Computational Biology* / methods
  • Humans
  • RNA, Circular* / genetics

Substances

  • RNA, Circular