Inference of expanded Lrp-like feast/famine transcription factor targets in a non-model organism using protein structure-based prediction

PLoS One. 2014 Sep 25;9(9):e107863. doi: 10.1371/journal.pone.0107863. eCollection 2014.

Abstract

Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof of concept, we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. These predictions were validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. Structure-based prediction of transcription factor binding sites is useful for inferring gene regulatory relationships in new species where such relationships are otherwise difficult to determine.
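To make concrete how a predicted DNA-binding specificity can be turned into candidate target sites, the following is a minimal sketch of scanning a sequence with a log-odds position weight matrix. It is a generic illustration, not the structure-based method or scoring scheme of the paper: the count matrix, the uniform background, the pseudocount, and the function names `log_odds_matrix`, `scan`, and `best_hit` are all assumptions made for this example, and the input is assumed to contain only uppercase A, C, G, and T.

```python
import math

# Hypothetical 4-position count matrix, one dict per motif position.
# The numbers are invented for illustration and are not taken from the paper.
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 1, "G": 0, "T": 9},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 9, "C": 0, "G": 0, "T": 1},
]

# Assumed uniform background; a GC-rich genome would need its own frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds_matrix(counts, background, pseudocount=1.0):
    """Convert per-position base counts into log2 odds against the background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(column)
        matrix.append({
            base: math.log2(((column[base] + pseudocount) / total) / background[base])
            for base in column
        })
    return matrix


def scan(sequence, matrix):
    """Score every window of motif width; return (start, window, score) tuples."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits


def best_hit(sequence, matrix):
    """Return the highest-scoring window in the sequence."""
    return max(scan(sequence, matrix), key=lambda hit: hit[2])


if __name__ == "__main__":
    pwm = log_odds_matrix(COUNTS, BACKGROUND)
    # Hypothetical promoter fragment, forward strand only.
    start, window, score = best_hit("CCATGACC", pwm)
    print(start, window, round(score, 2))
```

In practice, windows scoring above a chosen threshold in promoter regions would be treated as candidate binding sites, and both strands would normally be scanned; the threshold and strand handling are left out of this sketch.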

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Archaeal Proteins / chemistry*
  • Archaeal Proteins / genetics
  • Archaeal Proteins / metabolism*
  • Arginine / metabolism
  • Binding Sites
  • Computational Biology / methods*
  • DNA, Archaeal / metabolism
  • Gene Regulatory Networks
  • Halobacterium salinarum / genetics*
  • Halobacterium salinarum / metabolism*
  • Molecular Sequence Data
  • Operon / genetics
  • Protein Binding
  • Pyrimidines / metabolism
  • Regulatory Sequences, Nucleic Acid / genetics
  • Substrate Specificity
  • Transcription Factors / chemistry
  • Transcription Factors / metabolism*

Substances

  • Archaeal Proteins
  • DNA, Archaeal
  • Pyrimidines
  • Transcription Factors
  • Arginine
  • pyrimidine