The human genome encodes more than 1,500 RNA-binding proteins (RBPs), which coordinate regulatory events on RNA transcripts. Most studies of RBPs have concentrated on their action on host protein-encoding mRNAs, which constitute a minority of the transcriptome. A largely neglected subset of our transcriptome derives from integrated retroviral elements, termed endogenous retroviruses (ERVs), which comprise ∼8% of the human genome. Some ERVs have been shown to be transcribed under physiological and pathological conditions, suggesting that sophisticated regulatory mechanisms exist to coordinate their expression and prevent it from occurring ectopically. However, it is unknown how broadly RBPs and ERV transcripts directly interact to provide a posttranscriptional layer of regulation. Here, we implemented a computational pipeline to determine the correlation of expression between individual RBPs and ERVs from single-cell or bulk RNA-sequencing data. One of our top candidates for an RBP negatively regulating ERV expression was RNA-binding motif protein 4 (RBM4). We used photoactivatable ribonucleoside-enhanced cross-linking and immunoprecipitation (PAR-CLIP) to demonstrate that RBM4 indeed bound ERV transcripts at CGG consensus elements. Loss of RBM4 resulted in elevated transcript levels of bound ERVs of the HERV-K and HERV-H families, as well as increased expression of HERV-K envelope protein. We pinpointed RBM4 regulation of HERV-K to a CGG-containing element that is conserved in the LTRs of HERV-K-10, -K-11, and -K-20, and validated the functionality of this site using reporter assays. In summary, we systematically identified RBPs that may regulate ERV function and demonstrated a role for RBM4 in controlling ERV expression.
Keywords: PAR-CLIP; RNA sequencing; RNA-binding proteins; endogenous retroviruses; posttranscriptional regulation.