An exploratory data analysis method to reveal modular latent structures in high-throughput data

BMC Bioinformatics. 2010 Aug 27;11:440. doi: 10.1186/1471-2105-11-440.

Abstract

Background: Modular structures are ubiquitous across various types of biological networks. The study of network modularity can help reveal regulatory mechanisms in systems biology, evolutionary biology and developmental biology. Identifying putative modular latent structures from high-throughput data using exploratory analysis can help better interpret the data and generate new hypotheses. Unsupervised learning methods designed for global dimension reduction or clustering fall short of identifying modules with factors acting in linear combinations.

Results: We present an exploratory data analysis method named MLSA (Modular Latent Structure Analysis) to estimate modular latent structures. The method can find co-regulated modules that include genes that are not co-expressed.
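
To illustrate why genes in the same module need not be co-expressed, here is a minimal Python sketch. It is not the MLSA algorithm and is not taken from the authors' R code. It only simulates one hypothetical module in which each gene is a linear combination of two independent latent factors plus noise. The gene names, loadings, sample size, and noise level are all assumptions chosen for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate_module(n_samples=2000, noise_sd=0.1, seed=0):
    """Simulate one module driven by two independent latent factors.

    All loadings below are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    factor_1 = [rng.gauss(0, 1) for _ in range(n_samples)]
    factor_2 = [rng.gauss(0, 1) for _ in range(n_samples)]

    # (loading on factor_1, loading on factor_2) for each gene in the module
    loadings = {
        "gene_a": (1.0, 0.0),  # driven by factor 1 only
        "gene_b": (0.0, 1.0),  # driven by factor 2 only
        "gene_c": (1.0, 1.0),  # driven by both factors
    }

    genes = {}
    for name, (w1, w2) in loadings.items():
        genes[name] = [
            w1 * u + w2 * v + rng.gauss(0, noise_sd)
            for u, v in zip(factor_1, factor_2)
        ]
    return genes


genes = simulate_module()
r_ab = pearson(genes["gene_a"], genes["gene_b"])
r_ac = pearson(genes["gene_a"], genes["gene_c"])
r_bc = pearson(genes["gene_b"], genes["gene_c"])

print(f"corr(gene_a, gene_b) = {r_ab:.2f}")
print(f"corr(gene_a, gene_c) = {r_ac:.2f}")
print(f"corr(gene_b, gene_c) = {r_bc:.2f}")
```

In this toy setup, gene_a and gene_b belong to the same module yet are close to uncorrelated, while each correlates clearly with gene_c. A method that groups genes only by pairwise co-expression would be expected to split such a module, which is the kind of structure the abstract says MLSA targets.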

Conclusions: Through simulations and real-data analyses, we show that the method can recover modular latent structures effectively. The method also performed very well on data generated from sparse global latent factor models. The R code is available at http://userwww.service.emory.edu/~tyu8/MLSA/.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Carcinoma, Squamous Cell / genetics
  • Cell Cycle
  • Cell Line, Tumor
  • Computer Simulation
  • Gene Regulatory Networks*
  • Humans
  • Information Systems*
  • Lung Neoplasms / genetics
  • Models, Biological*
  • Models, Statistical
  • Systems Biology
  • Systems Theory*
  • Yeasts / physiology