Conserved co-expression for candidate disease gene prioritization

BMC Bioinformatics. 2008 Apr 23;9:208. doi: 10.1186/1471-2105-9-208.

Abstract

Background: Genes that are co-expressed tend to be involved in the same biological process. However, co-expression is not a very reliable predictor of functional links between genes. The evolutionary conservation of co-expression between species can be used to predict protein function more reliably than co-expression in a single species. Here we examine whether co-expression across multiple species is also a better prioritizer of disease genes than is co-expression between human genes alone.

Results: We use co-expression data from yeast (S. cerevisiae), nematode worm (C. elegans), fruit fly (D. melanogaster), mouse and human and find that the use of evolutionary conservation can indeed improve the predictive value of co-expression. Genes causing the same disease have higher co-expression than do other genes from their associated disease loci, and this effect is significantly enhanced when co-expression data are combined across evolutionarily distant species. We also find that performance can vary significantly depending on the co-expression datasets used, and simply using more data does not necessarily lead to better prioritization. Instead, we find that dataset quality is more important than quantity, and using a consistent microarray platform per species leads to better performance than using more inclusive datasets pooled from various platforms.

Conclusion: We find that evolutionarily conserved gene co-expression prioritizes disease candidate genes better than human gene co-expression alone, and provide the integrated data as a new resource for disease gene prioritization tools.
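
Illustration: the prioritization idea summarized above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the scoring scheme used in the paper: it assumes that co-expressed gene pairs from each species have already been mapped to human orthologs, that conservation is scored simply as the number of species in which a pair is co-expressed, and that a candidate is ranked by its best score against any known disease gene. The function names (`conserved_coexpression`, `prioritize`) and the gene identifiers (`G1`, `K1`, and so on) are hypothetical.

```python
def conserved_coexpression(coexpressed_by_species):
    """Count, for each gene pair, the number of species in which it is co-expressed.

    coexpressed_by_species maps a species name to an iterable of (gene, gene)
    pairs, with genes already mapped to human orthologs (an assumption).
    """
    counts = {}
    for pairs in coexpressed_by_species.values():
        # Deduplicate within a species so (a, b) and (b, a) count only once.
        unique_pairs = {frozenset(pair) for pair in pairs}
        for pair in unique_pairs:
            if len(pair) == 2:  # skip self-pairs such as (a, a)
                counts[pair] = counts.get(pair, 0) + 1
    return counts


def prioritize(candidates, known_disease_genes, counts):
    """Rank locus candidates by their best conservation score with a known disease gene."""
    def score(gene):
        return max(
            (counts.get(frozenset((gene, known)), 0)
             for known in known_disease_genes if known != gene),
            default=0,
        )
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(candidates, key=lambda gene: (-score(gene), gene))


# Toy data with hypothetical identifiers: K1 is a known disease gene,
# G1..G4 are candidates from the associated disease locus.
coexpressed = {
    "human": [("G1", "K1"), ("G2", "K1"), ("G3", "K1")],
    "mouse": [("G1", "K1"), ("G2", "K1")],
    "yeast": [("K1", "G1")],
}
counts = conserved_coexpression(coexpressed)
ranking = prioritize(["G3", "G2", "G1", "G4"], ["K1"], counts)
print(ranking)
```

In this toy example, G1 is co-expressed with K1 in all three species and therefore ranks first, while G4, which has no co-expression link to K1, ranks last. Counting species treats every dataset equally; since the abstract reports that dataset quality matters more than quantity, a real implementation would likely weight or filter datasets rather than count them uniformly.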

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Sequence
  • Caenorhabditis elegans / genetics
  • Conserved Sequence*
  • Databases, Genetic
  • Disease / etiology*
  • Drosophila melanogaster / genetics
  • Evolution, Molecular
  • Gene Dosage
  • Gene Expression
  • Gene Expression Profiling / methods*
  • Gene Frequency
  • Genetic Predisposition to Disease*
  • Humans
  • Mice
  • Oligonucleotide Array Sequence Analysis
  • Penetrance*
  • Predictive Value of Tests
  • Saccharomyces cerevisiae / genetics
  • Sample Size
  • Species Specificity