Zero-preserving imputation of single-cell RNA-seq data

Nat Commun. 2022 Jan 11;13(1):192. doi: 10.1038/s41467-021-27729-z.

Abstract

A key challenge in analyzing single-cell RNA-sequencing data is the large number of false zeros, where genes that are actually expressed in a given cell are incorrectly measured as unexpressed. We present a method based on low-rank matrix approximation that imputes these values while keeping biologically non-expressed genes (true biological zeros) at zero expression. We provide theoretical justification for this denoising approach and demonstrate its advantages relative to other methods on simulated and biological datasets.
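
The general idea of imputing with a low-rank approximation while keeping some entries at zero can be illustrated with a minimal sketch. The abstract does not give the algorithm's details, so everything specific here is an assumption made for illustration: the function name `zero_preserving_lowrank`, the choice of rank `k`, and the per-gene thresholding rule (zero out any reconstructed value no larger than the magnitude of that gene's most negative reconstructed value). It should not be read as the authors' implementation.

```python
import numpy as np


def zero_preserving_lowrank(X, k):
    """Illustrative sketch (not the published method).

    X : (cells x genes) non-negative expression matrix.
    k : rank of the approximation (assumed to be chosen by the user).
    """
    # Rank-k approximation via truncated SVD.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_k = (U[:, :k] * s[:k]) @ Vt[:k, :]

    # Assumed thresholding rule: for each gene (column), treat the
    # magnitude of its most negative reconstructed value as the noise
    # level and set every value at or below that level to zero.
    thresh = np.abs(np.minimum(X_k.min(axis=0), 0.0))
    X_imp = np.where(X_k > thresh, X_k, 0.0)

    # Entries that were observed as non-zero are never turned into zeros.
    X_imp = np.where((X > 0) & (X_imp == 0), X, X_imp)
    return X_imp


# Toy data: two cell types, each expressing its own block of two genes.
rng = np.random.default_rng(0)
truth = np.zeros((6, 4))
truth[:3, :2] = 5.0
truth[3:, 2:] = 5.0

# Simulate false zeros by randomly dropping about 30% of the entries.
observed = truth * (rng.random(truth.shape) > 0.3)
imputed = zero_preserving_lowrank(observed, k=2)
```

In this toy setup, the low-rank step can fill in dropped values from cells with similar expression profiles, while the threshold step is what keeps small reconstruction noise from being reported as expression.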

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Animals
  • B-Lymphocytes / cytology
  • B-Lymphocytes / metabolism
  • Bronchi / cytology
  • Bronchi / metabolism
  • Datasets as Topic
  • Epithelial Cells / cytology
  • Epithelial Cells / metabolism
  • Humans
  • Killer Cells, Natural / cytology
  • Killer Cells, Natural / metabolism
  • Mice
  • Monocytes / cytology
  • Monocytes / metabolism
  • Primary Cell Culture
  • RNA / genetics*
  • RNA / metabolism
  • RNA-Seq
  • Sequence Analysis, RNA / statistics & numerical data*
  • Single-Cell Analysis
  • T-Lymphocytes / cytology
  • T-Lymphocytes / metabolism

Substances

  • RNA