The correlation between CpG methylation and gene expression is driven by sequence variants

Nat Genet. 2024 Aug;56(8):1624-1631. doi: 10.1038/s41588-024-01851-2. Epub 2024 Jul 24.

Abstract

Gene promoter and enhancer sequences are bound by transcription factors and are depleted of methylated CpG sites (cytosines preceding guanines in DNA). The absence of methylated CpGs in these sequences typically correlates with increased gene expression, indicating a regulatory role for methylation. We used nanopore sequencing to determine haplotype-specific methylation rates of 15.3 million CpG units in 7,179 whole-blood genomes. We identified 189,178 methylation-depleted sequences where three or more proximal CpGs were unmethylated on at least one haplotype. A total of 77,789 methylation-depleted sequences (~41%) were associated with 80,503 cis-acting sequence variants, which we termed allele-specific methylation quantitative trait loci (ASM-QTLs). RNA sequencing of 896 samples from the same blood draws used for nanopore sequencing showed that ASM-QTLs, that is, DNA sequence variability, drive most of the correlation found between gene expression and CpG methylation. ASM-QTLs were enriched 40.2-fold (95% confidence interval 32.2, 49.9) among sequence variants associated with hematological traits, demonstrating that ASM-QTLs are important functional units in the noncoding genome.
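
The definition of a methylation-depleted sequence given above (three or more proximal CpGs unmethylated on at least one haplotype) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name, the rate cutoff for calling a CpG unmethylated (max_rate), the distance used for "proximal" (max_gap), and the example data are all hypothetical assumptions chosen for illustration; only the minimum of three CpGs comes from the abstract.

```python
from typing import Dict, List, Sequence, Tuple

Run = Tuple[int, int, int]  # (first CpG position, last CpG position, CpG count)


def find_depleted_runs(
    positions: Sequence[int],
    rates: Sequence[float],
    max_rate: float = 0.2,  # assumed cutoff for "unmethylated" (not from the paper)
    max_gap: int = 500,     # assumed distance in bp for "proximal" (not from the paper)
    min_cpgs: int = 3,      # "three or more" CpGs, as stated in the abstract
) -> List[Run]:
    """Find runs of >= min_cpgs neighboring unmethylated CpGs on one haplotype.

    positions must be sorted; rates[i] is the methylation rate (0..1)
    observed at positions[i] on that haplotype.
    """
    runs: List[Run] = []
    current: List[int] = []
    for pos, rate in zip(positions, rates):
        unmethylated = rate <= max_rate
        if unmethylated and current and pos - current[-1] <= max_gap:
            # Extend the open run with another nearby unmethylated CpG.
            current.append(pos)
            continue
        # The run is broken by a methylated CpG or by too large a gap.
        if len(current) >= min_cpgs:
            runs.append((current[0], current[-1], len(current)))
        current = [pos] if unmethylated else []
    if len(current) >= min_cpgs:
        runs.append((current[0], current[-1], len(current)))
    return runs


# Hypothetical per-haplotype methylation rates at seven CpG positions.
cpg_positions = [100, 150, 210, 900, 950, 1000, 1040]
haplotypes: Dict[str, List[float]] = {
    "hap1": [0.05, 0.10, 0.00, 0.90, 0.10, 0.05, 0.15],
    "hap2": [0.90, 0.80, 0.95, 0.90, 0.85, 0.90, 0.80],
}

# Apply the run finder to each haplotype separately.
depleted = {
    name: find_depleted_runs(cpg_positions, rates)
    for name, rates in haplotypes.items()
}
# A region qualifies if at least one haplotype carries a depleted run.
has_depleted_sequence = any(depleted.values())
```

In this toy input, hap1 yields two runs, (100, 210, 3) and (950, 1040, 3), while hap2 yields none, so the region counts as depleted on at least one haplotype. A difference of this kind between haplotypes is what an allele-specific analysis would then test against nearby sequence variants.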

MeSH terms

  • Alleles
  • CpG Islands*
  • DNA Methylation*
  • Gene Expression Regulation
  • Genetic Variation
  • Genome, Human
  • Haplotypes
  • Humans
  • Nanopore Sequencing / methods
  • Promoter Regions, Genetic
  • Quantitative Trait Loci*