Quantification of DNA patchiness using long-range correlation measures

Biophys J. 1997 Feb;72(2 Pt 1):866-75. doi: 10.1016/s0006-3495(97)78721-6.

Abstract

We introduce and develop new techniques to quantify DNA patchiness and characteristics of its mosaic structure. These techniques, which involve calculating two functions, alpha(l) and beta(l), measure correlations at length scale l and detect distinct characteristic patch sizes embedded in scale-invariant patch size distributions. Using these new methods, we address a number of issues relating to the mosaic structure of genomic DNA. We find several distinct characteristic patch sizes in certain genomic sequences, and compare, contrast, and quantify the correlation properties of different sequences, including a number of yeast, human, and prokaryotic sequences. We exclude the possibility that the correlation properties and the known mosaic structure of DNA can be explained either by simple Markov processes or by tandem repeats of dinucleotides. We find that the distinct patch sizes in all 16 yeast chromosomes are similar. Furthermore, we test the hypothesis that, for yeast, patchiness is caused by the alternation of coding and noncoding regions, and the hypothesis that in human sequences patchiness is related to repetitive sequences. We find that, by themselves, neither the alternation of coding and noncoding regions, nor repetitive sequences, can fully explain the long-range correlation properties of DNA.
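The abstract does not define alpha(l), so the following is only an illustrative sketch of what a scale-dependent correlation exponent might look like in practice, not the authors' implementation. It assumes that alpha(l) can be approximated by the local log-log slope of a detrended fluctuation function F(l) computed on a "DNA walk", and that the walk uses a purine/pyrimidine mapping (A, G to +1; C, T to -1). Both choices, and the function names `dna_walk`, `dfa_fluctuation`, and `local_alpha`, are assumptions introduced here for illustration. beta(l) is not sketched.

```python
import numpy as np

def dna_walk(seq):
    """Integrated walk: purines (A, G) step +1, pyrimidines (C, T) step -1.

    The purine/pyrimidine rule is an assumed mapping, not taken from the abstract.
    """
    steps = np.array([1.0 if base in "AG" else -1.0 for base in seq.upper()])
    return np.cumsum(steps - steps.mean())

def dfa_fluctuation(walk, l):
    """RMS residual of the walk around a linear fit in non-overlapping windows of length l."""
    n_windows = len(walk) // l
    x = np.arange(l)
    mean_sq_residuals = []
    for i in range(n_windows):
        segment = walk[i * l:(i + 1) * l]
        slope, intercept = np.polyfit(x, segment, 1)
        trend = slope * x + intercept
        mean_sq_residuals.append(np.mean((segment - trend) ** 2))
    return np.sqrt(np.mean(mean_sq_residuals))

def local_alpha(seq, scales):
    """Assumed proxy for alpha(l): slope of log F(l) vs log l between consecutive scales."""
    walk = dna_walk(seq)
    log_l = np.log(np.asarray(scales, dtype=float))
    log_f = np.log([dfa_fluctuation(walk, l) for l in scales])
    alpha = np.diff(log_f) / np.diff(log_l)
    # Report each slope at the geometric midpoint of the two scales it spans.
    mid_scales = np.exp((log_l[:-1] + log_l[1:]) / 2)
    return mid_scales, alpha

if __name__ == "__main__":
    # Uncorrelated random sequence: slopes should stay near 0.5 at every scale.
    rng = np.random.default_rng(0)
    random_seq = "".join(rng.choice(list("ACGT"), size=50_000))
    for l, a in zip(*local_alpha(random_seq, [8, 16, 32, 64, 128, 256])):
        print(f"l ~ {l:7.1f}  alpha ~ {a:.2f}")
```

Under these assumptions, an uncorrelated sequence gives slopes close to 0.5 across scales, whereas a sequence with a characteristic patch size would be expected to show a scale-dependent departure from a constant slope near that size.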

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Chromosomes / chemistry
  • Chromosomes, Fungal / chemistry
  • DNA / chemistry*
  • DNA, Fungal / chemistry
  • Humans
  • Markov Chains
  • Repetitive Sequences, Nucleic Acid
  • Saccharomyces cerevisiae / chemistry
  • Sequence Analysis

Substances

  • DNA, Fungal
  • DNA