Co-localization between Sequence Constraint and Epigenomic Information Improves Interpretation of Whole-Genome Sequencing Data

Am J Hum Genet. 2020 Apr 2;106(4):513-524. doi: 10.1016/j.ajhg.2020.03.003.

Abstract

The identification of functional regions in the noncoding human genome is difficult but critical for understanding the role that noncoding variation plays in gene regulation in human health and disease. We describe here a co-localization approach that aims to identify constrained sequences that co-localize with tissue- or cell-type-specific regulatory regions, and we show that the resulting score is particularly well suited for the identification of rare regulatory variants. For 127 tissues and cell types in the ENCODE/Roadmap Epigenomics Project, we provide catalogs of putative tissue- or cell-type-specific regulatory regions under sequence constraint. We use the newly developed co-localization score for brain tissues to score de novo mutations in whole genomes from 1,902 individuals affected with autism spectrum disorder (ASD) and their unaffected siblings in the Simons Simplex Collection. We show that noncoding de novo mutations near genes co-expressed in midfetal brain with high-confidence ASD risk genes, and near FMRP gene targets, are more likely to be in co-localized regions if they occur in ASD probands than if they occur in their unaffected siblings. We also observe a similar enrichment for mutations near lincRNAs, previously shown to co-express with ASD risk genes. Additionally, we provide strong evidence that prioritized de novo mutations in autism probands, unlike the de novo mutations in unaffected siblings, point to a small set of well-known ASD genes, the disruption of which produces relevant mouse phenotypes such as abnormal social investigation and abnormal discrimination/associative learning. The genome-wide co-localization results are available online.

Keywords: colocalization; epigenomic annotations; sequence constraint; whole-genome sequencing studies.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Autism Spectrum Disorder / genetics
  • Epigenomics / methods
  • Gene Expression Regulation / genetics*
  • Genome, Human / genetics*
  • Humans
  • Mutation / genetics
  • Phenotype
  • Siblings
  • Whole Genome Sequencing / methods