Exploiting single-cell expression to characterize co-expression replicability

Megan Crow; Anirban Paul; Sara Ballouz; Z Josh Huang; Jesse Gillis

doi:10.1186/s13059-016-0964-6

Genome Biol. 2016 May 6:17:101. doi: 10.1186/s13059-016-0964-6.

Authors

Megan Crow¹, Anirban Paul¹, Sara Ballouz¹, Z Josh Huang¹, Jesse Gillis²

Affiliations

¹ Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY 11724, USA.
² Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY 11724, USA. jgillis@cshl.edu.

Abstract

Background: Co-expression networks have been a useful tool for functional genomics, providing important clues about the cellular and biochemical mechanisms that are active in normal and disease processes. However, co-expression analysis is often treated as a black box with results being hard to trace to their basis in the data. Here, we use both published and novel single-cell RNA sequencing (RNA-seq) data to understand fundamental drivers of gene-gene connectivity and replicability in co-expression networks.

Results: We perform the first major analysis of single-cell co-expression, sampling from 31 individual studies. Using neighbor voting in cross-validation, we find that single-cell network connectivity is less likely to overlap with known functions than co-expression derived from bulk data, and that functional variation within cell types strongly resembles that occurring across cell types. To identify features and analysis practices that contribute to this connectivity, we perform our own single-cell RNA-seq experiment on 126 cortical interneurons, using an experimental design targeted to co-expression. By assessing network replicability, semantic similarity, and overall functional connectivity, we identify technical factors influencing co-expression and suggest how they can be controlled for. Many of the technical effects we identify are expression-level dependent, making expression level itself highly predictive of network topology. Re-analysis of the BrainSpan RNA-seq data shows that this holds generally.
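The neighbor voting step named in the Results can be illustrated with a small sketch. This is not the authors' implementation; it assumes one common formulation, in which each gene is scored by the fraction of its weighted connectivity that goes to genes labeled with a function in the training folds, and performance is summarized as the mean AUROC over held-out folds. The function names `auroc` and `neighbor_voting_auroc`, the fold count, and the toy network are all hypothetical.

```python
import numpy as np


def auroc(scores, positives):
    """Rank-based AUROC (Mann-Whitney U); tied scores share an average rank."""
    scores = np.asarray(scores, dtype=float)
    positives = np.asarray(positives, dtype=bool)
    n_pos = int(positives.sum())
    n_neg = int((~positives).sum())
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")
    # Ordinal ranks 1..n, then average them within groups of tied scores.
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    _, inverse = np.unique(scores, return_inverse=True)
    ranks = (np.bincount(inverse, weights=ranks) / np.bincount(inverse))[inverse]
    return (ranks[positives].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def neighbor_voting_auroc(network, labels, n_folds=3, seed=0):
    """Mean cross-validated AUROC for one gene set (assumed formulation).

    network: symmetric genes x genes matrix of non-negative weights.
    labels:  boolean vector, True for genes annotated with the function.
    """
    network = np.asarray(network, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(np.flatnonzero(labels)), n_folds)
    degree = network.sum(axis=1)  # total connectivity of each gene
    results = []
    for held_out in folds:
        if len(held_out) == 0:
            continue
        train = labels.copy()
        train[held_out] = False  # hide this fold's annotations
        votes = network[:, train].sum(axis=1)  # weight to known members
        scores = np.divide(votes, degree, out=np.zeros_like(votes),
                           where=degree > 0)
        # Evaluate on held-out members versus genes outside the set.
        evaluated = ~train
        is_held_out = np.zeros(len(labels), dtype=bool)
        is_held_out[held_out] = True
        results.append(auroc(scores[evaluated], is_held_out[evaluated]))
    return float(np.mean(results))


# Toy network: two disconnected modules of three genes each.
toy = np.zeros((6, 6))
toy[:3, :3] = 1.0
toy[3:, 3:] = 1.0
np.fill_diagonal(toy, 0.0)
toy_labels = np.array([True, True, True, False, False, False])
print(neighbor_voting_auroc(toy, toy_labels))  # prints 1.0
```

In the toy network the labeled genes form their own module, so each hidden gene is recovered perfectly. A network whose connectivity is unrelated to the gene set would be expected to score near 0.5.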

Conclusions: Technical properties of single-cell RNA-seq data create confounds in co-expression networks which can be identified and explicitly controlled for in any supervised analysis. This is useful both in improving co-expression performance and in characterizing single-cell data in generally applicable terms, permitting cross-laboratory comparison within a common framework.

Keywords: Autism; Brain; Co-expression; Interneuron; Meta-analysis; Network; Normalization; RNA-seq; Single cell.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Animals
Cell Separation / methods
Gene Expression Profiling / methods*
Gene Expression Profiling / standards
Gene Regulatory Networks
Mice
Reproducibility of Results
Sequence Analysis, RNA / methods*
Sequence Analysis, RNA / standards
Single-Cell Analysis / methods*
Single-Cell Analysis / standards
