Complex structural variants in Mendelian disorders: identification and breakpoint resolution using short- and long-read genome sequencing

Genome Med. 2018 Dec 7;10(1):95. doi: 10.1186/s13073-018-0606-6.

Abstract

Background: Studies have shown that complex structural variants (cxSVs) contribute to human genomic variation and can cause Mendelian disease. We aimed to identify cxSVs relevant to Mendelian disease using short-read whole-genome sequencing (WGS), resolve the precise variant configuration and investigate possible mechanisms of cxSV formation.

Methods: We performed short-read WGS and analysis of breakpoint junctions to identify cxSVs in a cohort of 1324 undiagnosed rare disease patients. Long-read WGS and gene expression analysis were used to resolve one case.

Results: We identified three pathogenic cxSVs: a de novo duplication-inversion-inversion-deletion affecting ARID1B, a de novo deletion-inversion-duplication affecting HNRNPU and a homozygous deletion-inversion-deletion affecting CEP78. Additionally, a de novo duplication-inversion-duplication overlapping CDKL5 was resolved by long-read WGS, which demonstrated the presence of both a disrupted and an intact copy of CDKL5 on the same allele; gene expression analysis showed that both parental alleles of CDKL5 were expressed. Breakpoint analysis of all the cxSVs revealed both microhomology and longer repetitive elements.

Conclusions: Our results corroborate that cxSVs cause Mendelian disease, and we recommend their consideration during clinical investigations. We show that resolution of breakpoints can be critical for interpreting pathogenicity and present evidence of replication-based mechanisms in cxSV formation.
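
As an illustration of the microhomology reported at the breakpoint junctions, the following is a minimal sketch of how such homology might be measured for a single junction. It is not the authors' pipeline: the function name, its arguments and the toy sequences are hypothetical, and it assumes both reference flanks are given on the same strand with known breakpoint offsets. The idea is that when the bases on either side of the two breakpoints are identical in the reference, the exact join position is ambiguous, and the length of that ambiguous stretch is the microhomology.

```python
def junction_microhomology(left_ref, i, right_ref, j):
    """Return the microhomology shared at a breakpoint junction.

    Hypothetical helper for illustration only.
    left_ref[:i] is the reference sequence retained on the left of the
    junction; right_ref[j:] is the reference sequence retained on the right.
    Both flanks are assumed to be on the same strand.
    """
    # Count identical bases when shifting both breakpoints to the right.
    fwd = 0
    while (i + fwd < len(left_ref) and j + fwd < len(right_ref)
           and left_ref[i + fwd] == right_ref[j + fwd]):
        fwd += 1

    # Count identical bases when shifting both breakpoints to the left.
    bwd = 0
    while (i - bwd - 1 >= 0 and j - bwd - 1 >= 0
           and left_ref[i - bwd - 1] == right_ref[j - bwd - 1]):
        bwd += 1

    # The shared stretch spans both directions around the breakpoint.
    return left_ref[i - bwd:i + fwd]


# Toy example (made-up sequences, not data from the study).
left_flank = "AAACCGTTT"   # junction keeps "AAACC"
right_flank = "GGGCCGAAA"  # junction keeps "GAAA"
shared = junction_microhomology(left_flank, 5, right_flank, 5)
print(shared, len(shared))
```

In this toy case the two flanks share the three bases around the breakpoints, so the join could have occurred at any of four positions and still yield the same junction sequence. Longer repetitive elements at breakpoints would need alignment-based comparison rather than this exact-match extension.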

Keywords: ARID1B; CDKL5; CEP78; Complex structural variant; Genome sequencing; HNRNPU; Nanopore; Next-generation sequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Cycle Proteins / genetics
  • DNA-Binding Proteins / genetics
  • Female
  • Genetic Predisposition to Disease
  • Genome, Human*
  • Genomic Structural Variation*
  • Heterogeneous-Nuclear Ribonucleoprotein U / genetics
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Male
  • Mutation
  • Protein Serine-Threonine Kinases / genetics
  • Sequence Analysis, DNA
  • Transcription Factors / genetics

Substances

  • ARID1B protein, human
  • CEP78 protein, human
  • Cell Cycle Proteins
  • DNA-Binding Proteins
  • HNRNPU protein, human
  • Heterogeneous-Nuclear Ribonucleoprotein U
  • Transcription Factors
  • Protein Serine-Threonine Kinases
  • CDKL5 protein, human