Incorporating inter-relationships between different levels of genomic data into cancer clinical outcome prediction

Methods. 2014 Jun 1;67(3):344-53. doi: 10.1016/j.ymeth.2014.02.003. Epub 2014 Feb 18.

Abstract

Improving our understanding of cancer and developing multi-layered theoretical models of its underlying mechanisms requires a better understanding of the interactions between the multiple levels of genomic data that contribute to tumor formation and progression. Recent approaches exist, such as a graph-based framework that integrates multi-omics data (copy number alteration, methylation, gene expression, and miRNA) for cancer clinical outcome prediction, but most previous methods treat each type of genomic data as independent and do not explicitly incorporate the possible interplay between them into the model. Cancer, however, involves dysregulation at multiple levels of the biological system: genomic, epigenomic, transcriptomic, and proteomic. Genomic features are therefore likely to interact with features at other genomic levels, and it would be desirable to incorporate such inter-relationship information when integrating multi-omics data for cancer clinical outcome prediction. In this study, we propose a new graph-based framework that integrates not only multi-omics data but also the inter-relationships between them to better elucidate cancer clinical outcomes. To assess the validity of the proposed framework, serous cystadenocarcinoma data from TCGA were used as a pilot task. The proposed model, which incorporates inter-relationships between different genomic features, showed significantly improved performance compared with the model that does not consider inter-relationships when integrating multi-omics data. For example, for the pair of miRNA and gene expression data, the model integrating miRNA, gene expression, and the inter-relationship between them achieved an AUC of 0.8476 (REI), outperforming the model combining only miRNA and gene expression data (AUC of 0.8404). Similar results were obtained for other pairs of genomic data levels.
Integration of different levels of data and inter-relationship between them can aid in extracting new biological knowledge by drawing an integrative conclusion from many pieces of information collected from diverse types of genomic data, eventually leading to more effective screening strategies and alternative therapies that may improve outcomes.
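
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of what a graph-based integration of two omics layers plus an inter-relationship layer could look like. Everything in it is an assumption made for illustration, not the authors' method: the Gaussian-kernel similarity graphs, the unweighted averaging of graphs, the label-propagation update, the use of element-wise miRNA-gene products as "inter-relationship" features, and all of the toy numbers (which are not TCGA data).

```python
import math

def similarity_graph(features, sigma=1.0):
    """Gaussian-kernel similarity between every pair of samples (no self-loops)."""
    n = len(features)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            W[i][j] = W[j][i] = math.exp(-d2 / (2 * sigma ** 2))
    return W

def combine_graphs(graphs):
    """Unweighted average of sample-by-sample graphs (assumed combination rule)."""
    n = len(graphs[0])
    return [[sum(g[i][j] for g in graphs) / len(graphs) for j in range(n)]
            for i in range(n)]

def propagate_labels(W, labels, alpha=0.8, iterations=50):
    """Iterate f <- alpha * P f + (1 - alpha) * y, where P is the row-normalized W.
    labels: +1 or -1 for samples with known outcome, 0 for unlabeled samples."""
    n = len(W)
    row_sums = [sum(row) or 1.0 for row in W]
    f = [float(y) for y in labels]
    for _ in range(iterations):
        f = [alpha * sum(W[i][j] * f[j] for j in range(n)) / row_sums[i]
             + (1 - alpha) * labels[i] for i in range(n)]
    return f

def auc(scores, truth):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, t in zip(scores, truth) if t == 1]
    neg = [s for s, t in zip(scores, truth) if t != 1]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy, made-up profiles for six samples (two features per layer).
mirna = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15], [0.1, 0.9], [0.2, 0.8], [0.15, 0.85]]
gene = [[0.2, 0.7], [0.3, 0.6], [0.25, 0.65], [0.7, 0.2], [0.6, 0.3], [0.65, 0.25]]

# Hypothetical inter-relationship features: products of every miRNA-gene pair.
inter = [[m * g for m in ms for g in gs] for ms, gs in zip(mirna, gene)]

labels = [1, 1, 0, -1, -1, 0]  # samples 2 and 5 are treated as unlabeled

without_inter = combine_graphs([similarity_graph(mirna), similarity_graph(gene)])
with_inter = combine_graphs([similarity_graph(mirna), similarity_graph(gene),
                             similarity_graph(inter)])

scores_without = propagate_labels(without_inter, labels)
scores_with = propagate_labels(with_inter, labels)
print("scores without inter-relationship graph:", scores_without)
print("scores with inter-relationship graph:   ", scores_with)
```

In a setting like the one the abstract describes, the two score vectors would be evaluated with `auc` on held-out samples to compare the models with and without the inter-relationship graph; the toy data here are far too small for that comparison to be meaningful.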

Keywords: Clinical outcome prediction; Data integration; Inter-relationship; Multi-omics data; Ovarian cancer; TCGA.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cystadenocarcinoma / diagnosis
  • Cystadenocarcinoma / genetics*
  • Cystadenocarcinoma / therapy
  • Female
  • Gene Expression Profiling
  • Genomics / methods*
  • Humans
  • Ovarian Neoplasms / diagnosis
  • Ovarian Neoplasms / genetics*
  • Ovarian Neoplasms / therapy
  • Precision Medicine
  • Prognosis
  • Treatment Outcome