Accurate age prediction from blood using a small set of DNA methylation sites and a cohort-based machine learning algorithm

Cell Rep Methods. 2023 Sep 25;3(9):100567. doi: 10.1016/j.crmeth.2023.100567. Epub 2023 Aug 28.

Abstract

Chronological age prediction from DNA methylation sheds light on human aging, health, and lifespan. Current clocks are mostly based on linear models and rely upon hundreds of sites across the genome. Here, we present GP-age, an epigenetic non-linear cohort-based clock for blood, based upon 11,910 methylomes. Using 30 CpG sites alone, GP-age outperforms state-of-the-art models, with a median absolute error of ∼2 years on held-out blood samples, for both array and sequencing-based data. We show that aging-related changes occur at multiple neighboring CpGs, with implications for using fragment-level analysis of sequencing data in aging research. By training three independent clocks, we show enrichment of donors with consistent deviation between predicted and actual age, suggesting individual rates of biological aging. Overall, we provide a compact yet accurate alternative to array-based clocks for blood, with applications in longitudinal aging research, forensic profiling, and monitoring epigenetic processes in transplantation medicine and cancer.
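
To make the idea of a non-linear, cohort-based clock concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "GP" stands for Gaussian process regression, which the abstract itself does not state. The synthetic cohort, the RBF kernel, the hyperparameter values, and the names `rbf_kernel`, `gp_predict`, and `make_synthetic_cohort` are all hypothetical and chosen only for illustration. Each prediction is a kernel-weighted combination of the ages of the cohort samples, which is one sense in which such a model is cohort-based.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential (RBF) kernel between the rows of a and b."""
    d2 = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale ** 2)


def gp_predict(x_train, y_train, x_test,
               length_scale=1.0, variance=1000.0, noise=4.0):
    """Posterior mean of a Gaussian process with a constant (mean-age) prior."""
    y_mean = y_train.mean()
    k = rbf_kernel(x_train, x_train, length_scale, variance)
    k += noise * np.eye(len(x_train))  # observation noise on the diagonal
    k_star = rbf_kernel(x_test, x_train, length_scale, variance)
    alpha = np.linalg.solve(k, y_train - y_mean)
    # Each test prediction is a weighted sum over the training cohort.
    return y_mean + k_star @ alpha


def make_synthetic_cohort(n_samples, n_sites=30, seed=0):
    """Hypothetical data: methylation levels drifting linearly with age."""
    rng = np.random.default_rng(seed)
    ages = rng.uniform(20.0, 80.0, n_samples)
    base = rng.uniform(0.3, 0.7, n_sites)          # methylation level at age 50
    slopes = rng.uniform(-0.005, 0.005, n_sites)   # change per year, per site
    noise = rng.normal(0.0, 0.02, (n_samples, n_sites))
    beta = base + np.outer(ages - 50.0, slopes) + noise
    return np.clip(beta, 0.0, 1.0), ages


# Simulate 400 donors with 30 sites each; hold out the last 100 for testing.
beta, ages = make_synthetic_cohort(400)
x_train, y_train = beta[:300], ages[:300]
x_test, y_test = beta[300:], ages[300:]

pred = gp_predict(x_train, y_train, x_test)
median_abs_error = float(np.median(np.abs(pred - y_test)))
baseline_error = float(np.median(np.abs(y_train.mean() - y_test)))
print(f"median absolute error: {median_abs_error:.2f} years "
      f"(mean-age baseline: {baseline_error:.2f} years)")
```

The error printed here reflects only the synthetic data and the arbitrary hyperparameters above; it says nothing about the performance reported in the abstract.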

Keywords: CP: Genetics; CP: Systems biology; DNA methylation; aging; computational biology; epigenetics; machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aging* / genetics
  • Algorithms
  • Base Sequence
  • Child, Preschool
  • DNA Methylation* / genetics
  • Epigenesis, Genetic
  • Humans