Our ability to determine the clinical impact of variants in 3' untranslated regions (UTRs) of genes remains poor. We provide a thorough analysis of 3' UTR variants from several datasets. Variants in putative regulatory elements, including RNA-binding protein motifs, eCLIP peaks, and microRNA sites, are up to 16 times more likely than variants not in these elements to have gene expression and phenotype associations. Variants in regulatory motifs result in allele-specific protein binding in cell lines and allele-specific gene expression differences in population studies. In addition, variants in shared regions of alternatively polyadenylated isoforms and those proximal to polyA sites are more likely to affect gene expression and phenotype. Finally, pathogenic 3' UTR variants in ClinVar are up to 20 times more likely than benign variants to fall in a regulatory site. We incorporated these findings into RegVar, a software tool that interprets regulatory elements and annotations for any 3' UTR variant and predicts whether the variant is likely to affect gene expression or phenotype. This tool will help prioritize variants for experimental studies and identify pathogenic variants in individuals.
Keywords: 3' UTR; RNA-binding proteins; genetic variants; miRNA; polyadenylation; regulatory motifs; variant interpretation.
Copyright © 2023 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.