Two-dimensional segmentation for analyzing Hi-C data

Bioinformatics. 2014 Sep 1;30(17):i386-92. doi: 10.1093/bioinformatics/btu443.

Abstract

Motivation: The spatial conformation of the chromosome has a deep influence on gene regulation and expression. Hi-C technology allows the evaluation of the spatial proximity between any pair of loci along the genome. The result is a data matrix in which blocks corresponding to (self-)interacting regions appear. Delimiting these blocks is critical to better understanding the spatial organization of the chromatin. From a computational point of view, this is a 2D segmentation problem.

Results: We focus on the detection of cis-interacting regions, which appear to be prominent in observed data. We define a block-wise segmentation model for the detection of such regions. We prove that the maximization of the likelihood with respect to the block boundaries can be rephrased as a 1D segmentation problem, to which standard dynamic programming applies. The performance of the proposed methods is assessed by a simulation study on both synthetic and resampled data. A comparative study on public data shows good concordance with biologically confirmed regions.
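
The reduction from 2D to 1D can be illustrated with a minimal sketch. It is not the HiCseg implementation: it assumes Gaussian noise, a constant mean inside each diagonal block, and a background mean `mu0` that is known in advance. The function name `segment_blocks` and its parameters are hypothetical. The idea is that every entry of the upper triangle belongs either to the diagonal block of its column's segment or to the rectangle above that block, so the total squared error is a sum of per-segment costs and a 1D dynamic program over boundaries applies.

```python
import numpy as np

def segment_blocks(Y, K, mu0=0.0):
    """Split indices 0..n-1 into K contiguous segments (illustrative sketch).

    Assumptions (not taken from the paper): Gaussian noise, one mean per
    diagonal block, and a known background mean mu0 outside the blocks.
    Returns the boundaries [0, t_1, ..., n].
    """
    n = Y.shape[0]
    # cost[s, e]: squared error of segment [s, e), i.e. the upper triangle of
    # its diagonal block around the block mean, plus the rectangle above the
    # block (rows < s, columns s..e-1) around mu0.
    # Naive O(n^4) computation, adequate for small matrices only.
    cost = np.full((n + 1, n + 1), np.inf)
    for s in range(n):
        for e in range(s + 1, n + 1):
            block = Y[s:e, s:e][np.triu_indices(e - s)]
            rect = Y[:s, s:e]
            cost[s, e] = ((block - block.mean()) ** 2).sum() \
                + ((rect - mu0) ** 2).sum()

    # Standard 1D dynamic programming over segment end points.
    best = np.full((K + 1, n + 1), np.inf)
    arg = np.zeros((K + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, K + 1):
        for e in range(k, n + 1):
            cands = best[k - 1, :e] + cost[:e, e]
            arg[k, e] = int(np.argmin(cands))
            best[k, e] = cands[arg[k, e]]

    # Backtrack from the last boundary to recover all change points.
    bounds = [n]
    for k in range(K, 0, -1):
        bounds.append(int(arg[k, bounds[-1]]))
    return bounds[::-1]

# Toy example: a noiseless 12 x 12 matrix with three diagonal blocks.
Y = np.zeros((12, 12))
for (s, e), m in zip([(0, 4), (4, 9), (9, 12)], [5.0, 3.0, 4.0]):
    Y[s:e, s:e] = m
print(segment_blocks(Y, K=3))
```

In this sketch the number of blocks `K` is supplied by the caller; choosing it from the data is a separate model-selection step that the sketch does not address.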

Availability and implementation: The HiCseg R package is available from the Comprehensive R Archive Network and from the Web page of the corresponding author.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Chromatin / chemistry
  • Chromosomes, Human / chemistry
  • Chromosomes, Mammalian / chemistry*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Mice
  • Models, Statistical
  • Sequence Analysis, DNA

Substances

  • Chromatin