Aims/hypothesis: Transcription factor 7-like 2 (TCF7L2) has been strongly implicated in type 2 diabetes and cancer. Our goal was to identify the DNA sequences bound by this transcription factor in vivo.
Methods: We applied chromatin immunoprecipitation followed by sequencing (ChIP-seq) to identify and map, genome-wide, the human DNA sequences bound by TCF7L2 in the colorectal carcinoma cell line HCT116, in which TCF7L2 is abundantly expressed.
Results: We identified 1,095 discrete binding sites across the genome, of which a subset were within 5 kb of 548 annotated NCBI Reference Sequence (RefSeq) genes. Although we used a cancer cell line, the most significant functions identified by pathway analysis software were related to diabetes, genetic disorders and coronary artery disease. As one of the enriched categories was related to genetic disorders, we queried our results against all published genome-wide association studies (GWAS) and found that reported loci were highly significantly over-represented among the genes with a TCF7L2 binding site within 5 kb (p = 7.50 × 10⁻¹⁵). This observation was primarily driven by an excess of loci from GWAS of metabolic and cardiovascular traits; in contrast, GWAS-derived loci for cancer and for inflammatory or neurological diseases showed little or no enrichment. Of the specific traits, the most enriched loci were those for type 2 diabetes and height. When the distance threshold from genes was extended to 50 kb or 500 kb, this enrichment pattern persisted, with some additional evidence for enrichment of cancer-related loci.
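The two computational steps described in the Results (assigning binding sites to genes within a distance window, then testing for over-representation of GWAS loci) can be illustrated with a minimal sketch. The abstract does not state which statistical test or distance definition was used, so the one-sided hypergeometric test and the measurement of distance from the gene span below are assumptions, as are all function names, gene names and coordinates, which are purely hypothetical toy values.

```python
from math import comb


def genes_near_sites(sites, genes, window):
    """Return names of genes whose span, extended by `window` bp on each
    side, overlaps at least one binding site.

    Assumption: distance is measured from the gene span; the abstract does
    not say whether the span or the transcription start site was used.
    """
    hits = set()
    for chrom, s_start, s_end in sites:
        for name, g_chrom, g_start, g_end in genes:
            same_chrom = chrom == g_chrom
            overlaps = s_start <= g_end + window and s_end >= g_start - window
            if same_chrom and overlaps:
                hits.add(name)
    return hits


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which are GWAS-reported loci.

    Assumption: a one-sided hypergeometric test stands in for whatever
    enrichment test the study actually used.
    """
    total = comb(N, n)
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical toy data: (chromosome, start, end) for binding sites and
# (name, chromosome, start, end) for genes.
sites = [("chr1", 1000, 1200), ("chr2", 50000, 50300)]
genes = [
    ("GENE_A", "chr1", 3000, 8000),
    ("GENE_B", "chr1", 20000, 25000),
    ("GENE_C", "chr2", 60000, 70000),
    ("GENE_D", "chr3", 100, 900),
]
gwas_genes = {"GENE_A", "GENE_C"}  # hypothetical GWAS-reported loci

bound_5kb = genes_near_sites(sites, genes, window=5_000)
bound_50kb = genes_near_sites(sites, genes, window=50_000)

# Enrichment of GWAS loci among genes bound within 50 kb.
p_value = hypergeom_upper_tail(
    k=len(bound_50kb & gwas_genes),  # bound genes that are GWAS loci
    N=len(genes),                    # background gene set
    K=len(gwas_genes),               # GWAS loci in the background
    n=len(bound_50kb),               # genes bound within the window
)
```

Widening `window` from 5 kb to 50 kb or 500 kb enlarges the set of bound genes, which is how the sensitivity of the enrichment to the distance threshold could be examined under these assumptions.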
Conclusions/interpretation: A highly significant proportion of genes bound by TCF7L2 are known disease-associated loci. These findings suggest that TCF7L2 is a central node in the regulation of genes associated with human diabetes and other diseases.