Computational prediction and experimental validation of novel Hedgehog-responsive enhancers linked to genes of the Hedgehog pathway

BMC Dev Biol. 2016 Feb 24:16:4. doi: 10.1186/s12861-016-0106-0.

Abstract

Background: The Hedgehog (Hh) signaling pathway, which in vertebrates acts through three homologous transcription factors (GLI1, GLI2, GLI3), plays multiple roles in embryonic organ development and adult tissue homeostasis. At the level of the genome, GLI factors bind to specific motifs in enhancers, some of which lie hundreds of kilobases from the gene promoter. These enhancers integrate the Hh signal in a context-specific manner to control the spatiotemporal pattern of target gene expression. Importantly, a number of genes that encode Hh pathway molecules are themselves targets of Hh signaling, allowing pathway regulation by an intricate balance of feedback activation and inhibition. However, surprisingly few of the critical enhancer elements that control these pathway target genes have been identified, even though such elements are central determinants of Hh signaling activity. Recently, ChIP studies have been carried out in multiple tissue contexts using mouse models carrying FLAG-tagged GLI proteins (GLI(FLAG)). Using these datasets, we tested whether a meta-analysis of GLI binding sites, coupled with a machine learning approach, could reveal genomic features useful for empirically identifying Hh-regulated enhancers linked to loci of the Hh signaling pathway.

Results: A meta-analysis of four existing GLI(FLAG) datasets revealed a library of GLI binding motifs that was substantially more restricted than the potential sites predicted by previous in vitro binding studies. A machine learning method (kmer-SVM) was then applied to these datasets, identifying enriched k-mers that, when applied to the mouse genome, predicted as many as 37,000 potential Hh enhancers. For functional analysis, we selected nine regions that were annotated to putative Hh pathway molecules and found that seven exhibited GLI-dependent activity, indicating that they are directly regulated by Hh signaling (7 of 9, a 78% success rate).
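
As a rough illustration of the sequence representation behind k-mer-based classifiers such as kmer-SVM, the Python sketch below computes a normalized k-mer spectrum for a DNA sequence, collapsing each k-mer with its reverse complement, and scores it with a linear function. This is not the authors' pipeline: the function names, the choice of k = 6, and the weights passed to `linear_score` are all hypothetical. In practice the weights would come from an SVM trained on bound versus background regions.

```python
from collections import Counter
from typing import Dict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Collapse a k-mer and its reverse complement into a single key."""
    return min(kmer, reverse_complement(kmer))


def kmer_spectrum(seq: str, k: int = 6) -> Dict[str, float]:
    """Normalized canonical k-mer frequencies for one sequence.

    k = 6 is an illustrative default, not a value taken from the abstract.
    """
    seq = seq.upper()
    counts: Counter = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows containing N or any other non-ACGT symbol.
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def linear_score(spectrum: Dict[str, float],
                 weights: Dict[str, float],
                 bias: float = 0.0) -> float:
    """Linear decision value; `weights` stand in for trained SVM weights."""
    return sum(weights.get(kmer, 0.0) * freq
               for kmer, freq in spectrum.items()) + bias
```

Under these assumptions, a genome-wide prediction would slide a window along each chromosome, compute `kmer_spectrum` for the window, and keep windows whose score exceeds a chosen threshold.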

Conclusions: The results suggest that Hh enhancer regions share common sequence features. The kmer-SVM machine learning approach identifies those features and can successfully predict functional Hh regulatory regions in genomic DNA surrounding Hh pathway molecules and, likely, other Hh targets. Additionally, the library of enriched GLI binding motifs that we have identified may allow improved identification of functional GLI binding sites.
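
To make the idea of applying a motif library concrete, the following Python sketch scans a sequence for exact matches to each motif on both strands. The motif list is an assumption: the abstract does not enumerate the enriched motifs, so the example uses only the widely cited GLI consensus site GACCACCCA. The function name `scan_motifs` and the toy sequence are hypothetical.

```python
from typing import Iterable, List, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_motifs(seq: str, motifs: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return sorted (start, strand, motif) tuples for exact matches.

    Positions are 0-based on the forward strand. A minus-strand hit is
    reported at the start of the reverse-complemented pattern.
    """
    seq = seq.upper()
    hits = []
    for motif in motifs:
        patterns = [("+", motif)]
        rc = reverse_complement(motif)
        if rc != motif:  # avoid reporting palindromic motifs twice
            patterns.append(("-", rc))
        for strand, pattern in patterns:
            start = seq.find(pattern)
            while start != -1:
                hits.append((start, strand, motif))
                # Advance by one so overlapping matches are also found.
                start = seq.find(pattern, start + 1)
    return sorted(hits)


# Assumed single-entry library: the canonical GLI consensus site.
GLI_MOTIFS = ["GACCACCCA"]

# Toy sequence with one forward-strand and one reverse-strand site.
toy = "TT" + "GACCACCCA" + "TT" + "TGGGTGGTC" + "AA"
hits = scan_motifs(toy, GLI_MOTIFS)
```

Exact matching is the simplest choice here. A library of several related motifs, as described above, could be passed in as additional entries in the list.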

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Base Sequence
  • Cell Line
  • Computational Biology / methods*
  • Enhancer Elements, Genetic / genetics*
  • Hedgehog Proteins / genetics*
  • Hedgehog Proteins / metabolism
  • Mice, Inbred C57BL
  • Molecular Sequence Data
  • Nucleotide Motifs / genetics
  • Oncogene Proteins / metabolism
  • Protein Binding
  • Reproducibility of Results
  • Signal Transduction / genetics*
  • Support Vector Machine
  • Trans-Activators / metabolism
  • Transcription Factors / metabolism
  • Zinc Finger Protein GLI1

Substances

  • Hedgehog Proteins
  • Oncogene Proteins
  • Trans-Activators
  • Transcription Factors
  • Zinc Finger Protein GLI1