The homologous recombination repair (HRR) pathway repairs DNA double-strand breaks in an error-free manner. Mutations in HRR genes can increase the mutation rate and cause genomic rearrangements, and are associated with numerous genetic disorders and cancer. Despite intensive research, the HRR pathway is not yet fully mapped. Phylogenetic profiling analysis, which detects functional linkage between genes using coevolution, is a powerful approach to identify factors in many pathways. Nevertheless, phylogenetic profiling has limited predictive power when analyzing pathways with complex evolutionary dynamics such as HRR. To map novel HRR genes systematically, we developed clade phylogenetic profiling (CladePP). CladePP detects local coevolution across hundreds of genomes and points to the evolutionary scale (e.g., mammals, vertebrates, animals, plants) at which coevolution occurred. We found that multiscale coevolution analysis is significantly more biologically relevant and more sensitive in detecting gene function. Using CladePP, we identified dozens of unrecognized genes that coevolved with the HRR pathway, either globally across all eukaryotes or locally in different clades. In functional biological assays, we validated that eight genes have a role in DNA repair at both the cellular and organismal levels. These genes are expected to play a role in the HRR pathway and might lead to a better understanding of missing heritability in HRR-associated cancers (e.g., hereditary breast and ovarian cancer). Our platform presents an innovative approach to predict gene function, identify novel factors related to different diseases and pathways, and characterize gene evolution.
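The idea of scoring coevolution separately within each clade can be illustrated with a minimal sketch. It is not the CladePP implementation: the function name `clade_coevolution`, the use of Pearson correlation as the coevolution score, and the toy conservation values are all assumptions made for illustration only.

```python
import numpy as np

def clade_coevolution(profiles, clades, query):
    """Correlate the query gene's profile with every other gene, clade by clade.

    profiles: dict mapping gene -> list of conservation scores, one per species
              (all lists in the same species order)
    clades:   dict mapping clade name -> list of species column indices
    query:    gene whose coevolving partners we want to score
    Pearson correlation is an assumed stand-in for the coevolution score.
    """
    genes = list(profiles)
    matrix = np.array([profiles[g] for g in genes], dtype=float)
    q = genes.index(query)
    scores = {}
    for clade, cols in clades.items():
        corr = np.corrcoef(matrix[:, cols])  # gene-by-gene correlation in this clade
        scores[clade] = {g: float(corr[q, i]) for i, g in enumerate(genes) if g != query}
    return scores

# Hypothetical toy data: 3 genes scored across 8 species (values are made up).
profiles = {
    "geneA": [1.0, 0.9, 0.2, 0.1, 0.8, 0.1, 0.9, 0.2],
    "geneB": [0.9, 0.8, 0.3, 0.2, 0.1, 0.9, 0.2, 0.8],
    "geneC": [0.2, 0.9, 0.1, 0.8, 0.3, 0.7, 0.2, 0.9],
}
clades = {
    "mammals": [0, 1, 2, 3],
    "plants": [4, 5, 6, 7],
    "all_eukaryotes": list(range(8)),
}
scores = clade_coevolution(profiles, clades, "geneA")
for clade, partners in scores.items():
    print(clade, {g: round(r, 2) for g, r in partners.items()})
```

In this toy data, geneA and geneB track each other closely within the first clade and oppose each other in the second, so the correlation across all species is close to zero. This is the kind of local signal that a single global profile comparison can miss.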
© 2019 Sherill-Rofe et al.; Published by Cold Spring Harbor Laboratory Press.