Machine learning-enhanced noninvasive prenatal testing of monogenic disorders

Prenat Diagn. 2024 Aug;44(9):1024-1032. doi: 10.1002/pd.6570. Epub 2024 Apr 30.

Abstract

Objective: Single-nucleotide variants (SNVs) are of great significance in prenatal diagnosis as they are the leading cause of inherited single-gene disorders (SGDs). Identifying SNVs in a non-invasive prenatal screening (NIPS) scenario is challenging, particularly for maternally inherited SNVs. We present an improved method to predict inherited SNVs of maternal or paternal origin in a genome-wide manner.

Methods: We performed SNV-NIPS by combining cell-free DNA (cfDNA) fragment features, Bayesian inference, and a machine-learning (ML) prediction refinement step that uses random forest (RF) classifiers trained on millions of non-pathogenic variants. We then evaluated the real-world performance of the refined method in a clinical setting by testing our models on 16 families with singleton pregnancies and varying fetal fraction (FF) levels, and validated the results over millions of inherited variants in each fetus.
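
To illustrate the kind of Bayesian inference named in the Methods, the following is a minimal sketch and not the authors' model. It assumes a simple binomial read-count model at a site where the mother is heterozygous and the father is homozygous for the reference allele, so the expected alternate-allele fraction in maternal plasma is 0.5 if the fetus inherited the alternate allele and 0.5 - FF/2 otherwise. The function names, the flat prior, and the read counts are hypothetical; the published method also uses cfDNA fragment features and an RF refinement step, neither of which is modeled here.

```python
import math


def binom_logpmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def posterior_maternal_alt_inherited(alt_reads, total_reads, fetal_fraction, prior=0.5):
    """Posterior probability that the fetus inherited the maternal alternate allele.

    Assumed setting (illustrative only): mother heterozygous, father
    homozygous reference, reads drawn independently from plasma cfDNA.
    """
    if not 0.0 < fetal_fraction < 1.0:
        raise ValueError("fetal_fraction must be strictly between 0 and 1")
    p_inherited = 0.5                              # fetus heterozygous
    p_not_inherited = 0.5 - fetal_fraction / 2.0   # fetus homozygous reference
    log_in = math.log(prior) + binom_logpmf(alt_reads, total_reads, p_inherited)
    log_out = math.log(1.0 - prior) + binom_logpmf(alt_reads, total_reads, p_not_inherited)
    # Normalize in log space to avoid underflow at high read depth.
    m = max(log_in, log_out)
    w_in, w_out = math.exp(log_in - m), math.exp(log_out - m)
    return w_in / (w_in + w_out)


if __name__ == "__main__":
    # Hypothetical counts: 1000 reads at a site, fetal fraction of 10%.
    print(posterior_maternal_alt_inherited(500, 1000, 0.10))
    print(posterior_maternal_alt_inherited(450, 1000, 0.10))
```

The two hypotheses differ by only FF/2 in expected allele fraction, which is one way to see why maternally inherited variants are harder to call than paternally inherited ones, especially at low FF or low read depth.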

Results: The average area under the ROC curve (AUC) across all families was 0.996 for paternally inherited variants, 0.81 for the more challenging maternally inherited variants, 0.86 for homozygous biallelic variants, and 0.95 for compound heterozygous variants. Discriminative AUCs were achieved even in families with a low FF. We further investigated the performance of our method in correctly predicting SNVs in coding regions of clinically relevant genes and demonstrated significantly improved AUCs in these regions. Finally, focusing on the pathogenic variants in our cohort, we show that our method correctly predicted whether the fetus was affected or unaffected in all families carrying a pathogenic SNV (10/10, 100%).
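
For readers less familiar with the AUC metric reported above, here is a small sketch of how it can be computed from per-variant prediction scores. It uses the standard rank interpretation of AUC (the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as one half). The function name and the toy scores are hypothetical and are not data from the study.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive and negative scores.

    labels: 1 for a variant that was truly inherited, 0 otherwise.
    Quadratic in the number of variants, so suitable for illustration only.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical scores for six variants.
    toy_scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
    toy_labels = [1, 1, 0, 1, 0, 0]
    print(roc_auc(toy_scores, toy_labels))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the gap between the paternally and maternally inherited results.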

Conclusions: Overall, we demonstrate our ability to perform genome-wide NIPS for maternal and homozygous biallelic variants and showcase the utility of our method in a clinical setting.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Cell-Free Nucleic Acids / analysis
  • Cell-Free Nucleic Acids / blood
  • Cell-Free Nucleic Acids / genetics
  • Female
  • Genetic Diseases, Inborn / diagnosis
  • Genetic Diseases, Inborn / genetics
  • Humans
  • Machine Learning*
  • Noninvasive Prenatal Testing* / methods
  • Polymorphism, Single Nucleotide
  • Pregnancy

Substances

  • Cell-Free Nucleic Acids