Abstract
A key challenge in analyzing single-cell RNA-sequencing data is the large number of false zeros, where genes actually expressed in a given cell are incorrectly measured as unexpressed. We present a method based on low-rank matrix approximation that imputes these values while preserving biologically non-expressed genes (true biological zeros) at zero expression levels. We provide theoretical justification for this denoising approach and demonstrate its advantages relative to other methods on simulated and biological datasets.
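The abstract does not give the algorithm, so the following is only a minimal sketch of how low-rank imputation with zero preservation might look. It assumes a truncated SVD for the low-rank step and a per-gene thresholding rule to restore zeros; the function name `low_rank_impute`, the rank parameter `k`, and the threshold rule are all illustrative assumptions, not details taken from the source.

```python
import numpy as np

def low_rank_impute(X, k):
    """Illustrative only: rank-k approximation plus an assumed per-gene threshold.

    X is a non-negative cells x genes expression matrix; k is the target rank.
    """
    X = np.asarray(X, dtype=float)
    # Truncated SVD: the best rank-k approximation in the Frobenius norm.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    # Assumed rule (not stated in the abstract): for each gene (column), take
    # the magnitude of its most negative reconstructed value as a noise level
    # and set every entry at or below that level to zero.
    noise = np.abs(np.minimum(X_k.min(axis=0), 0.0))
    return np.where(X_k > noise, X_k, 0.0)

# Toy usage: a rank-1 positive matrix with one simulated false zero.
rng = np.random.default_rng(0)
cells = np.outer(rng.uniform(1, 2, size=6), rng.uniform(1, 2, size=4))
cells[2, 1] = 0.0  # simulated dropout
imputed = low_rank_impute(cells, k=1)
print(imputed[2, 1])  # value filled in for the dropped entry
```

In this sketch the low-rank step fills in entries that are inconsistent with the dominant structure of the matrix, while the thresholding step is what keeps genes that reconstruct to near-zero values at exactly zero. Choosing `k` and the threshold in practice would need the guidance given in the full paper.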
© 2022. The Author(s).
Publication types
- Research Support, N.I.H., Extramural
MeSH terms
- Algorithms*
- Animals
- B-Lymphocytes / cytology
- B-Lymphocytes / metabolism
- Bronchi / cytology
- Bronchi / metabolism
- Datasets as Topic
- Epithelial Cells / cytology
- Epithelial Cells / metabolism
- Humans
- Killer Cells, Natural / cytology
- Killer Cells, Natural / metabolism
- Mice
- Monocytes / cytology
- Monocytes / metabolism
- Primary Cell Culture
- RNA / genetics*
- RNA / metabolism
- RNA-Seq
- Sequence Analysis, RNA / statistics & numerical data*
- Single-Cell Analysis
- T-Lymphocytes / cytology
- T-Lymphocytes / metabolism