Towards a standard benchmark for variant and gene prioritisation algorithms: PhEval - Phenotypic inference Evaluation framework

Yasemin Bridges; Vinicius de Souza; Katherina G Cortes; Melissa Haendel; Nomi L Harris; Daniel R Korn; Nikolaos M Marinakis; Nicolas Matentzoglu; James A McLaughlin; Christopher J Mungall; David Osumi-Sutherland; Peter N Robinson; Damian Smedley; Julius Ob Jacobsen

doi:10.1101/2024.06.13.598672

bioRxiv [Preprint]. 2024 Jun 16:2024.06.13.598672. doi: 10.1101/2024.06.13.598672.

Affiliations

¹ William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK.
² European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK.
³ School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA.
⁴ Department of Genetics, University of North Carolina, Chapel Hill, Chapel Hill, NC, 27599, USA.
⁵ Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.
⁶ Laboratory of Medical Genetics, National and Kapodistrian University of Athens, Athens, 11527, Greece.
⁷ Semanticly, Athens, 10563, Greece.
⁸ Samples, Phenotypes, and Ontologies (SPOT), European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK.
⁹ Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK.
¹⁰ Berlin Institute of Health, Charité - Universitätsmedizin Berlin, Berlin, 10117, Germany.

Abstract

Background: Computational approaches to support rare disease diagnosis are challenging to build, requiring the integration of complex data types such as ontologies, gene-to-phenotype associations, and cross-species data into variant and gene prioritisation algorithms (VGPAs). However, the performance of VGPAs has been difficult to measure and is affected by many factors, such as ontology structure, annotation completeness and changes to the underlying algorithm. Assertions of the capabilities of VGPAs are often not reproducible, in part because there is no standardised, empirical framework and no openly available patient data with which to assess the efficacy of VGPAs, which ultimately hinders the development of effective prioritisation tools.

Results: In this paper, we present our benchmarking tool, PhEval, which aims to provide a standardised and empirical framework to evaluate phenotype-driven VGPAs. The inclusion of standardised test corpora and test corpus generation tools in the PhEval suite of tools allows open benchmarking and comparison of methods on standardised data sets.

Conclusions: PhEval and the standardised test corpora solve the issues of patient data availability and experimental tooling configuration when benchmarking and comparing rare disease VGPAs. By providing standardised data on patient cohorts from real-world case reports and controlling the configuration of evaluated VGPAs, PhEval enables transparent, portable, comparable and reproducible benchmarking of VGPAs. As these tools are often a key component of many rare disease diagnostic pipelines, a thorough and standardised method of assessment is essential for improving patient diagnosis and care.
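
To illustrate what benchmarking a gene prioritisation algorithm against a standardised corpus can involve, the following is a minimal sketch of two rank-based metrics (top-k recall and mean reciprocal rank) computed over cases with a known causal gene. It is not PhEval's API, and the abstract does not state which metrics PhEval reports: the function names, the corpus layout, and the toy cases are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def rank_of_causal_gene(ranked_genes: List[str], causal_gene: str) -> int:
    """Return the 1-based rank of the causal gene, or 0 if it was not returned."""
    try:
        return ranked_genes.index(causal_gene) + 1
    except ValueError:
        return 0


def top_k_recall(ranks: List[int], k: int) -> float:
    """Fraction of cases whose causal gene is ranked within the top k."""
    if not ranks:
        return 0.0
    return sum(1 for r in ranks if 0 < r <= k) / len(ranks)


def mean_reciprocal_rank(ranks: List[int]) -> float:
    """Mean of 1/rank over all cases; a missing causal gene contributes 0."""
    if not ranks:
        return 0.0
    return sum(1.0 / r if r > 0 else 0.0 for r in ranks) / len(ranks)


# Hypothetical toy corpus (made-up cases, not real patient data):
# case id -> (known causal gene, ranked gene list produced by some tool)
corpus: Dict[str, Tuple[str, List[str]]] = {
    "case_1": ("FBN1", ["FBN1", "TGFBR2", "COL3A1"]),
    "case_2": ("CFTR", ["SCNN1A", "CFTR", "DNAH5"]),
    "case_3": ("PAH", ["GCH1", "PTS", "QDPR"]),
}

ranks = [rank_of_causal_gene(genes, causal) for causal, genes in corpus.values()]

print("ranks:", ranks)
print("top-1 recall:", top_k_recall(ranks, 1))
print("top-3 recall:", top_k_recall(ranks, 3))
print("MRR:", mean_reciprocal_rank(ranks))
```

Running the same metrics over the same corpus for every tool, with each tool's configuration held fixed, is what makes the resulting numbers comparable between tools.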

Publication types

Preprint
