Abstract
With the advent of ultra-high-throughput sequencing technologies, researchers are increasingly turning to deep sequencing for gene expression studies. Here we present a set of rigorous methods for normalization, quantification of noise, and co-expression analysis of deep sequencing data. Using these methods on 122 cap analysis of gene expression (CAGE) samples of transcription start sites, we construct genome-wide 'promoteromes' in human and mouse consisting of a three-tiered hierarchy of transcription start sites, transcription start clusters, and transcription start regions.
Publication types
- Research Support, Non-U.S. Gov't
MeSH terms
- Algorithms
- Animals
- Base Composition
- Cell Line
- Cluster Analysis
- Computational Biology / methods
- CpG Islands / genetics
- Gene Expression Profiling / methods
- Gene Expression Profiling / statistics & numerical data*
- Genome-Wide Association Study / methods
- Humans
- Mice
- Oligonucleotide Array Sequence Analysis / methods
- Oligonucleotide Array Sequence Analysis / statistics & numerical data*
- Promoter Regions, Genetic / genetics*
- Reproducibility of Results
- Sequence Analysis, DNA / methods
- Sequence Analysis, DNA / statistics & numerical data*
- Transcription Initiation Site*