Robust identification of deletions in exome and genome sequence data based on clustering of Mendelian errors

Hum Mutat. 2018 Jun;39(6):870-881. doi: 10.1002/humu.23419. Epub 2018 Mar 22.

Abstract

Multiple tools have been developed to identify copy number variants (CNVs) from whole exome (WES) and whole genome sequencing (WGS) data. Current tools such as XHMM for WES and CNVnator for WGS identify CNVs based on changes in read depth. For WGS, other methods identify CNVs from discordant read pairs and split reads (e.g., Lumpy) or from genome-wide local assembly (e.g., SvABA). Here, we introduce a new method to identify deletion CNVs from WES and WGS trio data based on the clustering of Mendelian errors (MEs). Using our Mendelian Error Method (MEM), we identified 127 deletions (inherited and de novo) in 2,601 WES trios from the Pediatric Cardiac Genomics Consortium, with a validation rate of 88% by digital droplet PCR. MEM identified additional de novo deletions compared with XHMM, and a significant enrichment of 15q11.2 deletions compared with controls. In addition, MEM identified eight cases of uniparental disomy, sample switches, and DNA contamination. We applied MEM to WGS data from the Genome In A Bottle Ashkenazi trio and identified deletions with 97% specificity. MEM provides a robust, computationally inexpensive method for identifying deletions, and an orthogonal approach for verifying deletions called by other tools.
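To illustrate the general idea of clustering Mendelian errors in a trio, the following is a minimal sketch in Python. It is not the authors' MEM implementation, and the abstract does not specify their algorithm or thresholds. The sketch assumes biallelic sites with genotypes encoded as alternate-allele counts (0, 1, or 2), sorted by position. The names `Site`, `is_mendelian_error`, and `cluster_mendelian_errors`, the `min_errors` threshold, and the rule that a heterozygous call in the child ends a candidate region (because a hemizygous deletion should leave only one allele) are all assumptions made for illustration.

```python
from typing import NamedTuple


class Site(NamedTuple):
    chrom: str
    pos: int
    child: int   # alternate-allele count: 0, 1, or 2
    father: int
    mother: int


# Alleles (0 = ref, 1 = alt) that a parent with a given genotype can transmit.
GAMETES = {0: {0}, 1: {0, 1}, 2: {1}}


def is_mendelian_error(child: int, father: int, mother: int) -> bool:
    """True if the child's genotype cannot arise from one allele per parent."""
    possible = {f + m for f in GAMETES[father] for m in GAMETES[mother]}
    return child not in possible


def cluster_mendelian_errors(sites, min_errors=3):
    """Return (chrom, start, end, n_errors) for runs of Mendelian errors.

    Assumptions of this sketch: sites are sorted by chromosome and position;
    a run is broken by a chromosome change or by a heterozygous,
    Mendelian-consistent call in the child; homozygous consistent sites
    neither extend nor break a run.
    """
    clusters = []
    run = []
    prev_chrom = None

    def flush():
        # Keep the current run only if it has enough errors, then reset it.
        if len(run) >= min_errors:
            clusters.append((run[0].chrom, run[0].pos, run[-1].pos, len(run)))
        run.clear()

    for s in sites:
        if s.chrom != prev_chrom:
            flush()
            prev_chrom = s.chrom
        if is_mendelian_error(s.child, s.father, s.mother):
            run.append(s)
        elif s.child == 1:
            flush()
    flush()
    return clusters
```

For example, a father who is homozygous reference (0) and a mother who is homozygous alternate (2) can only have a heterozygous child, so a child call of 0 at that site counts as a Mendelian error, which is the pattern expected when the child carries a deletion on the maternal allele. In practice, a caller built on this idea would also need genotype-quality filtering and distance limits between errors, which this sketch omits.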

Keywords: UPD; copy number variant identification; whole exome sequencing; whole genome sequencing.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Chromosome Mapping
  • DNA Copy Number Variations / genetics*
  • DNA Mutational Analysis / methods*
  • Exome / genetics
  • Exome Sequencing
  • Female
  • Genome, Human / genetics*
  • Heart Defects, Congenital / genetics
  • Humans
  • Male
  • Sequence Deletion / genetics*
  • Whole Genome Sequencing