Familial long-read sequencing increases yield of de novo mutations

Michelle D Noyes; William T Harvey; David Porubsky; Arvis Sulovari; Ruiyang Li; Nicholas R Rose; Peter A Audano; Katherine M Munson; Alexandra P Lewis; Kendra Hoekzema; Tuomo Mantere; Tina A Graves-Lindsay; Ashley D Sanders; Sara Goodwin; Melissa Kramer; Younes Mokrab; Michael C Zody; Alexander Hoischen; Jan O Korbel; W Richard McCombie; Evan E Eichler

doi:10.1016/j.ajhg.2022.02.014

Am J Hum Genet. 2022 Apr 7;109(4):631-646. doi: 10.1016/j.ajhg.2022.02.014. Epub 2022 Mar 14.

Authors

Michelle D Noyes¹, William T Harvey¹, David Porubsky¹, Arvis Sulovari¹, Ruiyang Li¹, Nicholas R Rose¹, Peter A Audano¹, Katherine M Munson¹, Alexandra P Lewis¹, Kendra Hoekzema¹, Tuomo Mantere², Tina A Graves-Lindsay³, Ashley D Sanders⁴, Sara Goodwin⁵, Melissa Kramer⁵, Younes Mokrab⁶, Michael C Zody⁷, Alexander Hoischen⁸, Jan O Korbel⁴, W Richard McCombie⁵, Evan E Eichler⁹

Affiliations

¹ Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA.
² Department of Human Genetics, Radboud University Medical Center, 6500 Nijmegen, the Netherlands; Laboratory of Cancer Genetics and Tumor Biology, Cancer and Translational Medicine Research Unit and Biocenter Oulu, University of Oulu, 90220 Oulu, Finland.
³ McDonnell Genome Institute, Washington University, St. Louis, MO 63108, USA.
⁴ European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany.
⁵ Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA.
⁶ Department of Human Genetics, Sidra Medicine, PO Box 26999, Doha, Qatar; Weill Cornell Medicine, PO Box 24144, Doha, Qatar; College of Health and Life Sciences, Hamad Bin Khalifa University, PO Box 34110, Doha, Qatar.
⁷ New York Genome Center, New York, NY 10013, USA.
⁸ Department of Human Genetics, Radboud University Medical Center, 6500 Nijmegen, the Netherlands; Radboud Institute of Medical Life Sciences and Department of Internal Medicine and Radboud Center for Infectious Diseases, Radboud University Medical Center, 6500 Nijmegen, the Netherlands.
⁹ Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA. Electronic address: eee@gs.washington.edu.

Abstract

Studies of de novo mutation (DNM) have typically excluded some of the most repetitive and complex regions of the genome because these regions cannot be unambiguously mapped with short-read sequencing data. To better understand the genome-wide pattern of DNM, we generated long-read sequence data from an autism parent-child quad with an affected female where no pathogenic variant had been discovered in short-read Illumina sequence data. We deeply sequenced all four individuals by using three sequencing platforms (Illumina, Oxford Nanopore, and Pacific Biosciences) and three complementary technologies (Strand-seq, optical mapping, and 10X Genomics). Using long-read sequencing, we initially discovered and validated 171 DNMs across two children, a 20% increase in the number of de novo single-nucleotide variants (SNVs) and indels when compared to short-read callsets. The number of DNMs further increased by 5% when considering a more complete human reference (T2T-CHM13) because of the recovery of events in regions absent from GRCh38 (e.g., three DNMs in heterochromatic satellites). In total, we validated 195 de novo germline mutations and 23 potential post-zygotic mosaic mutations across both children; the overall true substitution rate based on this integrated callset is at least 1.41 × 10^-8 substitutions per nucleotide per generation. We also identified six de novo insertions and deletions in tandem repeats, two of which represent structural variants. We demonstrate that long-read sequencing and assembly, especially when combined with a more complete reference genome, increases the number of DNMs by >25% compared to previous studies, providing a more complete catalog of DNM compared to short-read data alone.
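
A per-generation substitution rate of the kind quoted above is conventionally estimated by dividing the number of de novo substitutions by twice the number of callable haploid bases, summed over the children analyzed. The following is a minimal sketch under that assumption; the abstract does not state the exact denominator the authors used. The function name `substitution_rate` and all input values are hypothetical and are not figures from the study.

```python
def substitution_rate(n_de_novo_snvs: int, callable_bases: int, n_children: int = 1) -> float:
    """Estimate substitutions per nucleotide per generation.

    Assumes the common estimator: count / (2 * callable haploid bases * children).
    The factor of 2 accounts for the two haploid genomes each child inherits.
    """
    if n_de_novo_snvs < 0:
        raise ValueError("n_de_novo_snvs must be non-negative")
    if callable_bases <= 0 or n_children <= 0:
        raise ValueError("callable_bases and n_children must be positive")
    return n_de_novo_snvs / (2 * callable_bases * n_children)


# Hypothetical inputs chosen only to illustrate the arithmetic (not study data):
# 100 de novo SNVs across 2 children, 2.5 Gbp callable haploid genome per child.
rate = substitution_rate(100, 2_500_000_000, n_children=2)
print(f"{rate:.2e}")  # prints 1.00e-08
```

Because the denominator counts only callable bases, extending the callable fraction into repetitive regions changes both the count and the genome size used, which is why rates from different callsets are not directly comparable without stating both.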

Keywords: autism; de novo mutation; genome sequencing; long-read sequencing.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Female
Genomics*
High-Throughput Nucleotide Sequencing*
Humans
Mutation / genetics
Nucleotides
Sequence Analysis, DNA
Software

Substances

Nucleotides
