Extracting a low-dimensional description of multiple gene expression datasets reveals a potential driver for tumor-associated stroma in ovarian cancer

Genome Med. 2016 Jun 10;8(1):66. doi: 10.1186/s13073-016-0319-7.

Abstract

Patterns in expression data that are conserved across multiple independent disease studies are likely to represent important molecular events underlying the disease. We present the INSPIRE method, which infers modules of co-expressed genes and the dependencies among those modules from multiple expression datasets that may contain different sets of genes. We show that INSPIRE infers more accurate models than existing methods for extracting low-dimensional representations of expression data. We demonstrate that applying INSPIRE to nine ovarian cancer datasets identifies HOPX as a new marker and potential driver of tumor-associated stroma, a result we followed with experimental validation. The implementation of INSPIRE is available at http://inspire.cs.washington.edu.
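
To make the idea of a module-level, low-dimensional representation concrete, the following is a minimal sketch and not the INSPIRE algorithm itself. It assumes the gene-to-module assignment is already given (the abstract describes inferring it), summarizes each module by the mean standardized expression of whichever of its genes a dataset contains, and estimates conditional dependencies among modules as partial correlations from a ridge-regularized inverse covariance. All names (`module_activity`, `partial_correlations`, the gene labels, the `ridge` value) and the random toy data are hypothetical.

```python
import numpy as np

def zscore(x):
    """Standardize each row (gene) to zero mean, unit variance across samples."""
    x = np.asarray(x, dtype=float)
    mu = x.mean(axis=1, keepdims=True)
    sd = x.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0  # avoid division by zero for constant genes
    return (x - mu) / sd

def module_activity(expr, genes, gene_to_module, n_modules):
    """Mean standardized expression per module for one dataset.

    expr: array of shape (n_genes, n_samples); genes: row labels.
    Modules with no gene in this dataset are left as NaN.
    """
    z = zscore(expr)
    act = np.full((n_modules, z.shape[1]), np.nan)
    for m in range(n_modules):
        rows = [i for i, g in enumerate(genes) if gene_to_module.get(g) == m]
        if rows:
            act[m] = z[rows].mean(axis=0)
    return act

def partial_correlations(activity, ridge=1e-3):
    """Partial correlations among modules from a regularized precision matrix."""
    cov = np.cov(activity)
    prec = np.linalg.inv(cov + ridge * np.eye(cov.shape[0]))
    d = np.sqrt(np.diag(prec))
    pc = -prec / np.outer(d, d)
    np.fill_diagonal(pc, 1.0)
    return pc

# Toy example: two datasets that measure different, overlapping gene sets.
rng = np.random.default_rng(0)
gene_to_module = {"g1": 0, "g2": 0, "g3": 1, "g4": 1, "g5": 2, "g6": 2}
genes_a = ["g1", "g2", "g3", "g5"]
genes_b = ["g2", "g3", "g4", "g5", "g6"]
expr_a = rng.normal(size=(len(genes_a), 20))  # 20 samples
expr_b = rng.normal(size=(len(genes_b), 30))  # 30 samples

# Each dataset is reduced to the same 3 module variables, then samples are pooled.
activity = np.hstack([
    module_activity(expr_a, genes_a, gene_to_module, 3),
    module_activity(expr_b, genes_b, gene_to_module, 3),
])
pc = partial_correlations(activity)
```

Because every dataset is mapped onto the same small set of module variables, samples from datasets with different gene sets can be pooled in this sketch, provided each module has at least one gene measured in every dataset.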

Keywords: Conditional dependence; Gene expression; HOPX; Latent variable; Low-dimensional representation; Module; Tumor-associated stroma; Variable discrepancy.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers, Tumor / genetics*
  • Computational Biology / methods*
  • Databases, Genetic
  • Female
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic
  • Homeodomain Proteins / genetics*
  • Homeodomain Proteins / metabolism
  • Humans
  • Ovarian Neoplasms / genetics*
  • Tumor Suppressor Proteins / genetics*
  • Tumor Suppressor Proteins / metabolism
  • Unsupervised Machine Learning

Substances

  • Biomarkers, Tumor
  • HOPX protein, human
  • Homeodomain Proteins
  • Tumor Suppressor Proteins