Functional regression clustering with multiple functional gene expressions

PLoS One. 2024 Nov 25;19(11):e0310991. doi: 10.1371/journal.pone.0310991. eCollection 2024.

Abstract

Gene expression data is often collected in time series experiments, under different experimental conditions. There may be genes that have very different gene expression profiles over time, but that adjust their gene expression patterns in the same way under experimental conditions. Our aim is to develop a method that finds clusters of genes in which the relationships between these temporal gene expression profiles are similar to one another, even if the individual temporal gene expression profiles differ. We propose a K-means-type algorithm in which each cluster is defined by a function-on-function regression model, which, inter alia, allows for multiple functional explanatory variables. We validate this novel approach through extensive simulations and then apply it to identify groups of genes whose diurnal expression pattern is perturbed by the season in a similar way. Our clusters are enriched for genes with similar biological functions, including one cluster enriched in both photosynthesis-related functions and polysomal ribosomes, which shows that our method provides useful and novel biological insights.
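
To make the idea of a K-means-type algorithm with one regression model per cluster concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes that every curve has been sampled on a common time grid, so that a function-on-function regression can be approximated by a ridge-penalised multivariate linear regression from the stacked predictor curves to the response curve. The function names (fit_cluster, residual_sse, regression_kmeans), the ridge penalty, the random initialisation and the handling of empty clusters are all assumptions made for illustration only.

```python
import numpy as np

def fit_cluster(X, Y, ridge=1e-2):
    """Ridge fit of Y ~ [1, X] @ B, a discretised stand-in (assumption)
    for a function-on-function regression model.

    X: (n_genes, p) predictor curves, stacked side by side on the grid.
    Y: (n_genes, q) response curves sampled on the grid.
    """
    Z = np.hstack([np.ones((X.shape[0], 1)), X])
    penalty = ridge * np.eye(Z.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept row unpenalised
    return np.linalg.solve(Z.T @ Z + penalty, Z.T @ Y)

def residual_sse(X, Y, B):
    """Squared prediction error of every gene under the model B."""
    Z = np.hstack([np.ones((X.shape[0], 1)), X])
    return ((Y - Z @ B) ** 2).sum(axis=1)

def regression_kmeans(X, Y, k, n_iter=50, ridge=1e-2, seed=0):
    """Alternate between refitting one regression per cluster and
    reassigning each gene to the cluster whose model predicts it best."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    labels = rng.integers(k, size=n)  # random initial partition
    models = []
    for _ in range(n_iter):
        models = []
        for c in range(k):
            members = labels == c
            if not members.any():
                # Assumption: re-seed an empty cluster with one random gene.
                members = np.zeros(n, dtype=bool)
                members[rng.integers(n)] = True
            models.append(fit_cluster(X[members], Y[members], ridge))
        errors = np.column_stack([residual_sse(X, Y, B) for B in models])
        new_labels = errors.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels, models
```

Under these assumptions, multiple functional explanatory variables are handled by concatenating their sampled curves column-wise, for example X = np.hstack([X1, X2]), before calling regression_kmeans(X, Y, k). Genes are then grouped by how well a shared regression relationship predicts their response curve, not by the shape of the individual curves.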

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Gene Expression Profiling* / methods
  • Regression Analysis
  • Transcriptome

Grants and funding

This project was funded by the Alan Turing Institute Research Fellowship under EPSRC Research grant (TU/A/000017) to DE; Biotechnology and Biological Sciences Research Council (BBSRC) and Engineering and Physical Sciences Research Council (EPSRC). EPSRC/BBSRC Innovation Fellowship (EP/S001360/1) to DE and SC. ST would like to thank the Isaac Newton Institute for Mathematical Sciences, Cambridge, for support and hospitality during the programme Statistical Scalability, where work on this paper was undertaken. This work was supported by EPSRC grant no. EP/R014604/1. Engineering and Physical Sciences Research Council (EPSRC): https://www.ukri.org/councils/epsrc/; Alan Turing Institute: https://www.turing.ac.uk/; Biotechnology and Biological Sciences Research Council (BBSRC): https://www.ukri.org/councils/bbsrc/; Isaac Newton Institute for Mathematical Sciences: https://www.newton.ac.uk/. The funders did not play any role in the study design, data collection and analysis, decision to publish or preparation of the manuscript.