Unbiased analysis of potential targets of breast cancer susceptibility loci by Capture Hi-C

Genome Res. 2014 Nov;24(11):1854-68. doi: 10.1101/gr.175034.114. Epub 2014 Aug 13.

Abstract

Genome-wide association studies have identified more than 70 common variants that are associated with breast cancer risk. Most of these variants map to non-protein-coding regions and several map to gene deserts, regions of several hundred kilobases lacking protein-coding genes. We hypothesized that gene deserts harbor long-range regulatory elements that can physically interact with target genes to influence their expression. To test this, we developed Capture Hi-C (CHi-C), which, by incorporating a sequence capture step into a Hi-C protocol, allows high-resolution analysis of targeted regions of the genome. We used CHi-C to investigate long-range interactions at three breast cancer gene deserts mapping to 2q35, 8q24.21, and 9q31.2. We identified interaction peaks between putative regulatory elements ("bait fragments") within the captured regions and "targets" that included both protein-coding genes and long noncoding (lnc) RNAs over distances of 6.6 kb to 2.6 Mb. Target protein-coding genes were IGFBP5, KLF4, NSMCE2, and MYC; and target lncRNAs included DIRC3, PVT1, and CCDC26. For one gene desert, we were able to define two SNPs (rs12613955 and rs4442975) that were highly correlated with the published risk variant and that mapped within the bait end of an interaction peak. In vivo ChIP-qPCR data show that one of these, rs4442975, affects the binding of FOXA1 and implicate this SNP as a putative functional variant.
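
The capture step described above enriches a Hi-C library for read pairs with at least one end inside a targeted region. The following is a minimal sketch of that idea only, not the authors' analysis pipeline, which the abstract does not describe. The names `Region`, `ReadPair`, `classify_pair`, and `cis_distance` are hypothetical, and the coordinates in the usage example are placeholders rather than real locus positions.

```python
from typing import NamedTuple, Optional, Sequence


class Region(NamedTuple):
    """A captured genomic interval (hypothetical representation)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


class ReadPair(NamedTuple):
    """The two mapped ends of one Hi-C read pair."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def in_region(chrom: str, pos: int, region: Region) -> bool:
    """Return True if the position lies inside the region."""
    return chrom == region.chrom and region.start <= pos < region.end


def classify_pair(pair: ReadPair, captured: Sequence[Region]) -> str:
    """Label a read pair by how many of its ends fall in captured regions."""
    end1 = any(in_region(pair.chrom1, pair.pos1, r) for r in captured)
    end2 = any(in_region(pair.chrom2, pair.pos2, r) for r in captured)
    if end1 and end2:
        return "captured-captured"
    if end1 or end2:
        return "captured-other"
    return "uncaptured"


def cis_distance(pair: ReadPair) -> Optional[int]:
    """Linear distance between the two ends, or None for trans pairs."""
    if pair.chrom1 != pair.chrom2:
        return None
    return abs(pair.pos2 - pair.pos1)


if __name__ == "__main__":
    # Placeholder region and read pairs, for illustration only.
    captured = [Region("chrA", 1_000_000, 2_000_000)]
    pairs = [
        ReadPair("chrA", 1_500_000, "chrA", 3_500_000),
        ReadPair("chrA", 1_100_000, "chrA", 1_900_000),
        ReadPair("chrB", 5, "chrA", 10),
    ]
    for p in pairs:
        print(classify_pair(p, captured), cis_distance(p))
```

Pairs labeled `captured-other` are the ones that could link a captured region to a distal target. Calling interaction peaks from such pairs would need a statistical model, which this sketch does not attempt.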

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Neoplasms / genetics*
  • Breast Neoplasms / metabolism
  • Breast Neoplasms / pathology
  • Cell Line, Tumor
  • Chromatin Immunoprecipitation
  • Chromosome Mapping
  • Chromosomes, Human, Pair 2 / genetics
  • Chromosomes, Human, Pair 8 / genetics
  • Chromosomes, Human, Pair 9 / genetics
  • Genetic Predisposition to Disease / genetics*
  • Genome, Human / genetics
  • Genome-Wide Association Study / methods*
  • Hepatocyte Nuclear Factor 3-alpha / genetics
  • Hepatocyte Nuclear Factor 3-alpha / metabolism
  • Homeodomain Proteins / genetics
  • Homeodomain Proteins / metabolism
  • Humans
  • Kruppel-Like Factor 4
  • MCF-7 Cells
  • Oligonucleotide Array Sequence Analysis
  • Polymorphism, Single Nucleotide*
  • Protein Binding
  • Protein Interaction Mapping
  • RNA, Long Noncoding / genetics
  • RNA, Long Noncoding / metabolism
  • Real-Time Polymerase Chain Reaction
  • Regulatory Sequences, Nucleic Acid / genetics
  • Reproducibility of Results
  • Sequence Analysis, DNA

Substances

  • FOXA1 protein, human
  • HOXB13 protein, human
  • Hepatocyte Nuclear Factor 3-alpha
  • Homeodomain Proteins
  • KLF4 protein, human
  • Kruppel-Like Factor 4
  • RNA, Long Noncoding

Associated data

  • GEO/GSE55634