Quantification of Phenotype Information Aids the Identification of Novel Disease Genes

Hum Mutat. 2017 May;38(5):594-599. doi: 10.1002/humu.23176. Epub 2017 Feb 2.

Abstract

Next-generation sequencing led to the identification of many potential novel disease genes. The presence of mutations in the same gene in multiple unrelated patients is, however, a priori insufficient to establish that these genes are truly involved in the respective disease. Here, we show how phenotype information can be incorporated within statistical approaches to provide additional evidence for the causality of mutations. We developed a broadly applicable statistical model that integrates gene-specific mutation rates, cohort size, mutation type, and phenotype frequency information to assess the chance of identifying de novo mutations affecting the same gene in multiple patients with shared phenotype features. We demonstrate our approach based on the frequency of phenotype features present in a unique cohort of 6,149 patients with intellectual disability. We show that our combined approach can decrease the number of patients required to identify novel disease genes, especially for patients with combinations of rare phenotypes. In conclusion, we show how integrating genotype-phenotype information can aid significantly in the interpretation of de novo mutations in potential novel disease genes.
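
The abstract gives no formulas, so the following is only a minimal sketch of how gene-specific mutation rate, cohort size, and phenotype frequency might be combined. It is not the authors' model. It assumes that de novo mutation counts in a gene follow a Poisson distribution, that the rate is given per allele per generation for an autosomal gene (hence the factor 2), and that the phenotype feature was chosen before the mutation carriers were inspected. The function names, the example mutation rate of 1e-5, the phenotype frequency of 0.05, and the significance threshold of 2.5e-6 are all hypothetical. Only the cohort size of 6,149 comes from the text.

```python
import math


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), summing the upper tail directly."""
    if k <= 0:
        return 1.0
    if lam <= 0:
        return 0.0
    # P(X = k), computed in log space to avoid overflow in lam**k / k!
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    # Add P(X = k), P(X = k+1), ... until further terms are negligible.
    while term > total * 1e-17 and i < k + 1000:
        total += term
        i += 1
        term *= lam / i
    return total


def chance_probability(mu, n_patients, k, phenotype_freq=1.0):
    """Chance of >= k de novo mutations in one gene among the patients
    who share a phenotype feature (assumed model, not the published one).

    mu             -- hypothetical per-allele de novo mutation rate of the gene
    n_patients     -- total cohort size
    phenotype_freq -- fraction of the cohort showing the phenotype feature
    """
    # Expected mutation count in the phenotype-defined subcohort.
    lam = 2.0 * mu * n_patients * phenotype_freq
    return poisson_tail(k, lam)


def min_recurrence(mu, n_patients, phenotype_freq=1.0, alpha=2.5e-6, k_max=50):
    """Smallest number of mutated patients whose chance probability is below alpha.

    alpha defaults to 0.05 / 20,000 genes, an assumed multiple-testing threshold.
    Returns None if no k up to k_max qualifies.
    """
    for k in range(1, k_max + 1):
        if chance_probability(mu, n_patients, k, phenotype_freq) < alpha:
            return k
    return None


if __name__ == "__main__":
    mu = 1e-5      # hypothetical gene-specific mutation rate
    cohort = 6149  # cohort size reported in the abstract
    print("whole cohort:", min_recurrence(mu, cohort))
    print("rare feature:", min_recurrence(mu, cohort, phenotype_freq=0.05))
```

Under these assumptions, restricting the comparison to patients who share a rare feature lowers the expected number of chance mutations, so fewer recurrently mutated patients are needed to reach the same threshold. This matches the qualitative claim of the abstract, though the published model may weight mutation type and phenotype combinations differently.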

Keywords: de novo mutations; intellectual disability; patient cohorts; phenotype features; statistical approach; systematic phenotyping.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genetic Association Studies* / methods
  • Genetic Predisposition to Disease*
  • Genotype
  • Humans
  • Models, Genetic
  • Models, Statistical
  • Phenotype*
  • Reproducibility of Results