Deconvolution of heterogeneous tumor samples using partial reference signals

PLoS Comput Biol. 2020 Nov 30;16(11):e1008452. doi: 10.1371/journal.pcbi.1008452. eCollection 2020 Nov.

Abstract

Deconvolution of heterogeneous bulk tumor samples into distinct cellular populations is an important yet challenging problem, particularly when only partial references are available. A common approach is to deconvolve the mixed signals using the available references and treat the remaining signal as a new cell component. However, as our simulations indicate, this approach tends to over-estimate the proportions of known cell types and fails to detect novel cell types. Here, we propose PREDE, a partial reference-based deconvolution method using an iterative non-negative matrix factorization algorithm. On simulated datasets spanning a variety of parameter settings, our method effectively estimates the cell proportions and expression profiles of unknown cell types. Applying our method to TCGA tumor samples, we found that proportions of pure cancer cells better indicate different subtypes of tumor samples. We also detected several cell types for each cancer type whose proportions successfully predicted patient survival. Our method makes a significant contribution to the deconvolution of heterogeneous tumor samples and could be widely applied to a variety of high-throughput bulk data. PREDE is implemented in R and is freely available from GitHub (https://xiaoqizheng.github.io/PREDE).
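
To make the idea of partial reference-based deconvolution concrete, the following is a minimal sketch in Python, not the PREDE implementation (which is in R). It assumes a bulk matrix Y (genes x samples) is approximated by W H, where the first columns of W are known reference profiles held fixed and the remaining columns are unknown profiles learned from the data. It uses generic Lee-Seung multiplicative updates, which may differ from the iterative scheme PREDE actually uses. The function name `partial_reference_nmf`, its parameters, and the synthetic data are all hypothetical and chosen for illustration only.

```python
import numpy as np


def partial_reference_nmf(Y, W_known, n_unknown, n_iter=500, eps=1e-12, seed=0):
    """Illustrative NMF in which known reference columns of W stay fixed.

    Y        : (genes x samples) non-negative bulk matrix
    W_known  : (genes x k_known) non-negative reference profiles
    n_unknown: number of unknown cell types to learn (an assumed input)
    Returns the full profile matrix W and column-normalized proportions.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = Y.shape
    k_known = W_known.shape[1]

    # Known profiles first, then randomly initialized unknown profiles.
    W_init = rng.uniform(0.1, 1.0, size=(n_genes, n_unknown)) * Y.mean()
    W = np.hstack([W_known.astype(float), W_init])
    H = rng.uniform(0.1, 1.0, size=(k_known + n_unknown, n_samples))

    for _ in range(n_iter):
        # Multiplicative update for the mixing weights of all cell types.
        H *= (W.T @ Y) / (W.T @ W @ H + eps)

        # Multiplicative update restricted to the unknown profile columns;
        # the known reference columns are never modified.
        numer = Y @ H.T
        denom = W @ (H @ H.T) + eps
        W[:, k_known:] *= numer[:, k_known:] / denom[:, k_known:]

    # Rescale each sample's weights to sum to one (a simplification:
    # the scale of the unknown components is not identifiable here).
    proportions = H / H.sum(axis=0, keepdims=True)
    return W, proportions


# Synthetic example: 3 cell types, of which only the first 2 are known.
rng = np.random.default_rng(1)
W_true = rng.gamma(2.0, 1.0, size=(200, 3))      # genes x cell types
P_true = rng.dirichlet(np.ones(3), size=30).T    # cell types x samples
Y = W_true @ P_true                              # noise-free bulk data

W_hat, P_hat = partial_reference_nmf(Y, W_true[:, :2], n_unknown=1)
```

In this sketch the number of unknown cell types is supplied by the caller; how that number is chosen in practice is outside what the sketch covers.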

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Cell Line, Tumor
  • Computational Biology / methods
  • Gene Expression Profiling / methods
  • Humans
  • Neoplasms / classification
  • Neoplasms / genetics
  • Neoplasms / pathology*
  • Rats
  • Reproducibility of Results

Grants and funding

This work was supported by the National Key R&D Program of China [2018YFA0900600 to X.Z.]; National Natural Science Foundation of China [61902061 to W.Z., 61702325 to Y.Q., 61572327 and 61972257 to X.Z., 11871070 to X.S.]; Natural Science Foundation of Shanghai [20JC1413800 to X.Z.]; Shanghai Science and Technology Innovation Action Plan [16391902900 to Y.Q.]; Science and Technology Research Project of Jiangxi Education Department [GJJ170445 to W.Z.]; and Guangdong Basic and Applied Basic Research Foundation [2020B151502120 to X.S.]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.