Advancing Pan-cancer Gene Expression Survival Analysis by Inclusion of Non-coding RNA

RNA Biol. 2020 Nov;17(11):1666-1673. doi: 10.1080/15476286.2019.1679585. Epub 2019 Oct 18.

Abstract

Non-coding RNAs occupy a significant fraction of the human genome, and their biological significance is supported by a growing body of evidence. One of the most robust ways to demonstrate the biological relevance of non-coding RNAs is through their prognostic value. Using the rich gene expression data from The Cancer Genome Atlas (TCGA), we designed Advanced Expression Survival Analysis (AESA), a web tool which provides several novel survival analysis approaches not offered by previous tools. In addition to the common single-gene approach, AESA computes a gene expression composite score for a set of genes for survival analysis and uses a permutation test or cross-validation to assess the significance of the log-rank statistic and the degree of over-fitting. AESA offers survival feature selection with post-selection inference and utilizes expanded TCGA clinical data including overall, disease-specific, disease-free, and progression-free survival information. Users can analyse either protein-coding or non-coding regions of the transcriptome. We demonstrated the effectiveness of AESA using several empirical examples. Our analyses showed that non-coding RNAs perform as well as messenger RNAs in predicting survival of cancer patients. These results reinforce the potential prognostic value of non-coding RNAs. AESA is developed as a module in the freely accessible analysis suite MutEx.
Abbreviations: ACC: Adrenocortical Carcinoma (n = 92); BLCA: Bladder Urothelial Carcinoma (n = 412); BRCA: Breast Invasive Carcinoma (n = 1098); CESC: Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma (n = 307); CHOL: Cholangiocarcinoma (n = 51); COAD: Colon Adenocarcinoma (n = 461); DLBC: Lymphoid Neoplasm Diffuse Large B-cell Lymphoma (n = 58); ESCA: Oesophageal Carcinoma (n = 185); GBM: Glioblastoma Multiforme (n = 617); HNSC: Head and Neck Squamous Cell Carcinoma (n = 528); KICH: Kidney Chromophobe (n = 113); KIRC: Kidney Renal Clear Cell Carcinoma (n = 537); KIRP: Kidney Renal Papillary Cell Carcinoma (n = 291); LAML: Acute Myeloid Leukaemia (n = 200); LGG: Brain Lower Grade Glioma (n = 516); LIHC: Liver Hepatocellular Carcinoma (n = 377); LUAD: Lung Adenocarcinoma (n = 585); LUSC: Lung Squamous Cell Carcinoma (n = 504); MESO: Mesothelioma (n = 87); OV: Ovarian Serous Cystadenocarcinoma (n = 608); PAAD: Pancreatic Adenocarcinoma (n = 185); PCPG: Pheochromocytoma and Paraganglioma (n = 179); PRAD: Prostate Adenocarcinoma (n = 500); READ: Rectum Adenocarcinoma (n = 172); SARC: Sarcoma (n = 261); SKCM: Skin Cutaneous Melanoma (n = 470); STAD: Stomach Adenocarcinoma (n = 443); TGCT: Testicular Germ Cell Tumours (n = 150); THCA: Thyroid Carcinoma (n = 507); THYM: Thymoma (n = 124); UCEC: Uterine Corpus Endometrial Carcinoma (n = 560); UCS: Uterine Carcinosarcoma (n = 57); UVM: Uveal Melanoma (n = 80).
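
The composite-score and permutation-test idea described in the abstract can be illustrated with a minimal sketch. This is not AESA's implementation: the abstract does not define the composite score or the grouping rule, so the sketch assumes the score is the mean of per-gene z-scores and that patients are split at the median score. All function names (`composite_score`, `split_by_median`, `logrank_statistic`, `permutation_pvalue`) and the toy data are hypothetical.

```python
import random
from statistics import mean, median, pstdev


def composite_score(expr):
    """Per-patient mean of gene-wise z-scores (an assumed definition).

    expr maps a gene name to a list of expression values, one per patient.
    """
    n_patients = len(next(iter(expr.values())))
    scores = [0.0] * n_patients
    for values in expr.values():
        mu, sd = mean(values), pstdev(values)
        for i, v in enumerate(values):
            scores[i] += (v - mu) / sd if sd > 0 else 0.0
    return [s / len(expr) for s in scores]


def split_by_median(scores):
    """Label patients 1 (score above the median) or 0 (assumed grouping rule)."""
    cut = median(scores)
    return [1 if s > cut else 0 for s in scores]


def logrank_statistic(times, events, groups):
    """Two-group log-rank chi-square statistic.

    events: 1 = event observed, 0 = censored; groups: 0 or 1 per patient.
    """
    observed = expected = variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        n, n1 = len(at_risk), sum(at_risk)
        d = sum(1 for ti, e in zip(times, events) if e and ti == t)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if e and ti == t and g == 1)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return (observed - expected) ** 2 / variance if variance > 0 else 0.0


def permutation_pvalue(times, events, groups, n_perm=1000, seed=0):
    """Permutation p-value: shuffle group labels and recompute the statistic."""
    rng = random.Random(seed)
    observed = logrank_statistic(times, events, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if logrank_statistic(times, events, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data for eight hypothetical patients and two hypothetical genes.
    expr = {
        "geneA": [5.1, 4.8, 6.2, 7.0, 3.9, 6.8, 4.2, 7.4],
        "geneB": [2.0, 1.7, 3.1, 3.6, 1.2, 3.3, 1.9, 3.8],
    }
    times = [30, 42, 12, 8, 55, 10, 48, 6]
    events = [1, 0, 1, 1, 0, 1, 1, 1]
    groups = split_by_median(composite_score(expr))
    stat, p = permutation_pvalue(times, events, groups, n_perm=500)
    print(f"log-rank statistic = {stat:.3f}, permutation p = {p:.3f}")
```

Shuffling the group labels while keeping survival times and event indicators fixed yields a null distribution for the log-rank statistic that does not rely on the chi-square approximation, which matters when groups are small. In practice a maintained survival library would replace the hand-written statistic.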

Keywords: Cancer survival analysis; lincRNA; non-coding RNA; pseudogene.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Biomarkers, Tumor*
  • Computational Biology / methods
  • Databases, Genetic
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic*
  • Humans
  • Neoplasms / genetics*
  • Neoplasms / mortality*
  • Prognosis
  • RNA, Long Noncoding / genetics
  • RNA, Untranslated / genetics*

Substances

  • Biomarkers, Tumor
  • RNA, Long Noncoding
  • RNA, Untranslated