Epiclomal: Probabilistic clustering of sparse single-cell DNA methylation data

PLoS Comput Biol. 2020 Sep 23;16(9):e1008270. doi: 10.1371/journal.pcbi.1008270. eCollection 2020 Sep.

Abstract

We present Epiclomal, a probabilistic clustering method arising from a hierarchical mixture model to simultaneously cluster sparse single-cell DNA methylation data and impute missing values. Using synthetic and published single-cell CpG datasets, we show that Epiclomal outperforms non-probabilistic methods and can handle the pervasive missing data inherent to single-cell CpG sequencing. Using newly generated single-cell 5mCpG sequencing data, we show that Epiclomal discovers sub-clonal methylation patterns in aneuploid tumour genomes, thus defining epiclones that can match or transcend copy number-determined clonal lineages and opening up an important form of clonal analysis in cancer. Epiclomal is written in R and Python and is available at https://github.com/shahcompbio/Epiclomal.
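
The idea of clustering and imputing at the same time can be illustrated with a minimal sketch. This is not Epiclomal's implementation or API: it substitutes a plain (non-hierarchical) Bernoulli mixture fitted by expectation-maximization, and the function name `fit_bernoulli_mixture`, its parameters, and the encoding of missing CpG calls as NaN are all assumptions made for illustration. Missing entries contribute nothing to the likelihood, and are filled in afterwards from the cluster-level methylation rates.

```python
import numpy as np

def fit_bernoulli_mixture(X, n_clusters, n_iter=50, seed=0):
    """Hypothetical helper: EM for a Bernoulli mixture on a cells x CpG matrix.

    X holds 1 (methylated), 0 (unmethylated) or NaN (missing, an assumed encoding).
    Returns hard cluster labels and a copy of X with missing entries imputed.
    """
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(X)
    X0 = np.where(observed, X, 0.0)          # missing entries zeroed out
    unmeth = (1.0 - X0) * observed           # observed unmethylated calls only
    n_cells, n_sites = X.shape

    weights = np.full(n_clusters, 1.0 / n_clusters)
    rates = rng.uniform(0.25, 0.75, size=(n_clusters, n_sites))

    for _ in range(n_iter):
        # E-step: per-cell log-likelihood under each cluster, observed sites only
        log_p = X0 @ np.log(rates).T + unmeth @ np.log1p(-rates).T
        log_p += np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: mixing weights and smoothed per-cluster methylation rates
        weights = np.clip(resp.mean(axis=0), 1e-12, None)
        weights /= weights.sum()
        methylated = resp.T @ X0
        covered = resp.T @ observed.astype(float)
        rates = (methylated + 1.0) / (covered + 2.0)  # keeps rates inside (0, 1)

    labels = resp.argmax(axis=1)
    # Impute each missing call from the responsibility-weighted cluster rates
    expected = resp @ rates
    imputed = np.where(observed, X, (expected >= 0.5).astype(float))
    return labels, imputed
```

Calling `labels, imputed = fit_bernoulli_mixture(X, n_clusters=2)` on a NaN-masked binary matrix returns one cluster label per cell and a fully filled matrix in which observed calls are left unchanged. The add-one smoothing in the M-step is a design choice of this sketch so that sites with little or no coverage in a cluster fall back towards a rate of 0.5 rather than producing log(0).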

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • CpG Islands
  • DNA Methylation*
  • Humans
  • Probability
  • Sequence Analysis, DNA / methods
  • Single-Cell Analysis*