Analysis of genomic context: prediction of functional associations from conserved bidirectionally transcribed gene pairs

Nat Biotechnol. 2004 Jul;22(7):911-7. doi: 10.1038/nbt988.

Abstract

Several widely used methods for predicting functional associations between proteins are based on the systematic analysis of genomic context. Efforts are ongoing to improve these methods and to search for novel aspects in genomes that could be exploited for function prediction. Here, we use gene expression data to demonstrate two functional implications of genome organization: first, chromosomal proximity indicates gene coregulation in prokaryotes independent of relative gene orientation; and second, adjacent bidirectionally transcribed genes (that is, 'divergently' organized coding regions) with conserved gene orientation are strongly coregulated. We further demonstrate that such bidirectionally transcribed gene pairs are functionally associated and derive from this a novel genomic context method that reliably predicts links between >2,500 pairs of genes in approximately 100 species. Around 650 of these functional associations are supported by other genomic context methods. In most instances, one gene encodes a transcriptional regulator, and the other a nonregulatory protein. In-depth analysis in Escherichia coli shows that the vast majority of these regulators both control transcription of the divergently transcribed target gene/operon and auto-regulate their own biosynthesis. The method thus enables the prediction of target processes and regulatory features for several hundred transcriptional regulators.
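
The core idea in the abstract, finding adjacent genes transcribed in opposite directions away from each other and keeping those whose arrangement recurs across genomes, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract does not give their orthology assignment, adjacency criteria, or conservation thresholds. The `Gene` record, the `ortholog_group` labels, the `min_genomes` cutoff, and the toy coordinates are all hypothetical, and the sketch ignores wrap-around on circular chromosomes.

```python
from collections import Counter
from typing import NamedTuple


class Gene(NamedTuple):
    ortholog_group: str  # hypothetical orthologous-group label
    start: int           # leftmost coordinate on the replicon
    end: int             # rightmost coordinate on the replicon
    strand: str          # '+' or '-'


def divergent_pairs(genes):
    """Return adjacent gene pairs arranged as '<- ->' on one replicon.

    A pair is divergent when the left gene lies on the '-' strand and
    its right-hand neighbour lies on the '+' strand.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        if left.strand == '-' and right.strand == '+':
            pairs.append(frozenset((left.ortholog_group, right.ortholog_group)))
    return pairs


def conserved_divergent_pairs(genomes, min_genomes=2):
    """Count, per ortholog-group pair, the genomes in which it is divergent.

    `min_genomes` is an assumed cutoff chosen for illustration only.
    """
    counts = Counter()
    for genes in genomes.values():
        # set() so that each genome contributes at most once per pair
        counts.update(set(divergent_pairs(genes)))
    return {pair: n for pair, n in counts.items() if n >= min_genomes}


# Toy data (invented): a regulator 'regR' next to a target 'enzA'.
genome_a = [
    Gene("regR", 100, 400, '-'),
    Gene("enzA", 500, 900, '+'),
    Gene("enzB", 950, 1300, '+'),   # co-directional with enzA, not divergent
]
genome_b = [
    Gene("enzA", 2000, 2400, '-'),
    Gene("regR", 2500, 2800, '+'),
    Gene("other", 2900, 3100, '-'),  # convergent with regR, not divergent
]

result = conserved_divergent_pairs({"a": genome_a, "b": genome_b})
print(result)
```

Pairs are stored as unordered sets, so a locus that appears inverted in another genome still counts as the same divergent pair. A realistic analysis would also need to weight conservation by evolutionary distance between genomes, which this sketch does not attempt.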

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Gene Expression Regulation / genetics
  • Gene Order*
  • Genome, Bacterial
  • Genomics / methods*
  • Molecular Sequence Data
  • Phylogeny
  • Proteins / genetics*
  • Proteins / physiology*
  • Sequence Analysis, Protein / methods
  • Transcription Factors / genetics
  • Transcription Factors / metabolism
  • Transcription, Genetic

Substances

  • Proteins
  • Transcription Factors