Adding Highly Variable Genes to Spatially Variable Genes Can Improve Cell Type Clustering Performance in Spatial Transcriptomics Data

Res Sq [Preprint]. 2024 Oct 25:rs.3.rs-5315913. doi: 10.21203/rs.3.rs-5315913/v1.

Abstract

Spatial transcriptomics allows researchers to analyze transcriptome data in the spatial context of the tissue sample. Various methods have been developed for detecting spatially variable (SV) genes, whose expression shows strong spatial autocorrelation across the tissue. Such genes are often used downstream to define clusters of cells or spots. Conventional clustering analyses, however, use highly variable (HV) genes, whose expression levels vary substantially from cell to cell. In this report, we investigate whether adding HV genes to SV genes can improve cell type clustering performance in spatial transcriptomics data. We tested the clustering performance of HV genes, SV genes, and the union of both gene sets (concatenation) on over 50 real spatial transcriptomics datasets across multiple platforms, using a variety of spatial and non-spatial metrics. Our results show that combining HV genes and SV genes can improve overall cell type clustering performance.
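The idea of taking the union of the two gene sets can be illustrated with a minimal sketch. This is not the authors' pipeline: ranking genes by plain variance stands in for HV selection, Moran's I with binary k-nearest-neighbour weights stands in for SV detection, and the function names (`top_hv_genes`, `morans_i`, `top_sv_genes`), the toy data, and all parameter values are assumptions made for illustration only.

```python
import numpy as np

def top_hv_genes(expr, n_top):
    """Rank genes by variance across spots (a simple stand-in for HV selection)."""
    variances = expr.var(axis=0)
    return set(np.argsort(variances)[::-1][:n_top].tolist())

def morans_i(expr, coords, k=6):
    """Moran's I per gene, using binary k-nearest-neighbour spatial weights."""
    n = expr.shape[0]
    # Pairwise Euclidean distances between spots; exclude self-matches.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    weights = np.zeros((n, n))
    weights[np.repeat(np.arange(n), k), neighbours.ravel()] = 1.0
    centred = expr - expr.mean(axis=0)
    numerator = (centred * (weights @ centred)).sum(axis=0)
    denominator = (centred ** 2).sum(axis=0)
    # Constant genes have zero variance; assign them a score of 0.
    safe = np.where(denominator > 0, denominator, 1.0)
    return np.where(denominator > 0, (n / weights.sum()) * numerator / safe, 0.0)

def top_sv_genes(expr, coords, n_top, k=6):
    """Rank genes by Moran's I (a simple stand-in for SV detection)."""
    scores = morans_i(expr, coords, k=k)
    return set(np.argsort(scores)[::-1][:n_top].tolist())

# Toy data (hypothetical): 100 spots on a 10 x 10 grid, 4 genes.
rng = np.random.default_rng(0)
xs, ys = np.meshgrid(np.arange(10), np.arange(10))
coords = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
n_spots = coords.shape[0]
expr = np.column_stack([
    coords[:, 0] * 0.1,              # gene 0: smooth spatial gradient, low variance
    rng.normal(0.0, 5.0, n_spots),   # gene 1: high variance, no spatial pattern
    rng.normal(0.0, 0.01, n_spots),  # gene 2: low-variance noise
    rng.normal(0.0, 0.01, n_spots),  # gene 3: low-variance noise
])

hv = top_hv_genes(expr, n_top=1)
sv = top_sv_genes(expr, coords, n_top=1)
combined = hv | sv  # union of both gene sets, used as clustering features
features = expr[:, sorted(combined)]
print(sorted(hv), sorted(sv), sorted(combined))
```

In this toy setting the spatially structured gene and the high-variance gene are each missed by one of the two selections, so only the union retains both as features for a downstream clustering step.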

Keywords: clustering; feature selection; spatial transcriptomics.

Publication types

  • Preprint