VIPPID: a gene-specific single nucleotide variant pathogenicity prediction tool for primary immunodeficiency diseases

Brief Bioinform. 2022 Sep 20;23(5):bbac176. doi: 10.1093/bib/bbac176.

Abstract

Distinguishing pathogenic variants from non-pathogenic ones remains a major challenge in clinical genetic testing of primary immunodeficiency (PID) patients. Most existing mutation pathogenicity prediction tools treat all mutations as homogeneous entities, ignoring differences in the characteristics of individual genes, and apply the same model to genes involved in different diseases. In this study, we developed a single nucleotide variant (SNV) pathogenicity prediction tool, Variant Impact Predictor for PIDs (VIPPID; https://mylab.shinyapps.io/VIPPID/), which was tailored to PID genes and used a specific model for each of the most prevalent known PID genes. It employed a Conditional Inference Forest model and drew on 85 SNV features and scores from 20 existing prediction tools. Evaluation of VIPPID showed that it outperformed non-specific conventional tools (area under the curve = 0.91). We also showed that the gene-specific model outperformed the non-gene-specific models. Our study demonstrated that disease-specific and gene-specific models can improve SNV pathogenicity prediction performance. This observation supports the notion that each mutation feature in the model could potentially be used, in a new algorithm, to investigate the characteristics and function of the encoded proteins.
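As a rough illustration of the evaluation metric mentioned above, the following minimal Python sketch computes the area under the ROC curve separately for each gene from predicted pathogenicity scores. It is not the VIPPID implementation: the function names, the gene labels (GENE_A, GENE_B) and the toy scores are all hypothetical, and the scores could come from any predictor, not specifically a Conditional Inference Forest.

```python
from collections import defaultdict


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: 1 = pathogenic, 0 = non-pathogenic.
    scores: predicted pathogenicity, higher means more likely pathogenic.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one variant of each class")
    # Fraction of (pathogenic, non-pathogenic) pairs ranked correctly;
    # tied scores count as half a correct pair.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_per_gene(records):
    """records: iterable of (gene, label, score) tuples -> {gene: AUC}."""
    by_gene = defaultdict(lambda: ([], []))
    for gene, label, score in records:
        by_gene[gene][0].append(label)
        by_gene[gene][1].append(score)
    return {gene: auc(ys, ss) for gene, (ys, ss) in by_gene.items()}


# Toy data invented for illustration only (hypothetical genes and scores).
records = [
    ("GENE_A", 1, 0.9), ("GENE_A", 0, 0.2),
    ("GENE_A", 1, 0.7), ("GENE_A", 0, 0.8),
    ("GENE_B", 1, 0.6), ("GENE_B", 0, 0.1),
]
print(auc_per_gene(records))
```

Reporting the metric per gene in this way is one simple means of comparing a gene-specific model against a pooled one on the same variants.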

Keywords: computational analysis; genetic mutation; inborn errors of immunity (IEI); machine learning; primary immunodeficiency (PID); variant prediction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Humans
  • Nucleotides
  • Polymorphism, Single Nucleotide*
  • Primary Immunodeficiency Diseases*
  • Virulence

Substances

  • Nucleotides