Strand bias in complementary single-nucleotide polymorphisms of transcribed human sequences: evidence for functional effects of synonymous polymorphisms

BMC Genomics. 2006 Aug 17;7:213. doi: 10.1186/1471-2164-7-213.

Abstract

Background: Complementary single-nucleotide polymorphisms (SNPs) may not be distributed equally between the two DNA strands if the strands are functionally distinct, as they are in transcribed genes. In introns, an excess of A<-->G over the complementary C<-->T substitutions had previously been found and attributed to transcription-coupled repair (TCR), demonstrating the valuable functional clues that can be obtained by studying such asymmetry. Here we studied the asymmetry of human synonymous SNPs (sSNPs) at fourfold degenerate (FFD) sites and compared it with that of intronic SNPs (iSNPs).

Results: The identities of the ancestral bases and the direction of mutations were inferred from the human-chimpanzee genomic alignment. After correction for background nucleotide composition, an excess of A-->G over the complementary T-->C polymorphisms, which had been observed previously and can be explained by TCR, was confirmed in both FFD SNPs and iSNPs. However, when SNPs were examined separately according to whether or not they mapped to a CpG dinucleotide, an excess of C-->T over G-->A polymorphisms was found in non-CpG site FFD SNPs but was absent from iSNPs and CpG site FFD SNPs.
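The analysis steps named in the Results can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not give the exact correction for background composition or the exact CpG definition, so the normalization (dividing each count by the number of sites carrying the ancestral base) and the CpG test (C followed by G, or G preceded by C, in the supplied sequence context) are assumptions. All function names and the toy numbers are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def infer_direction(human_alleles, chimp_base):
    """Polarize a biallelic SNP, taking the chimpanzee base as ancestral.

    Returns (ancestral, derived), or None when the chimpanzee base matches
    neither human allele and the direction cannot be inferred.
    """
    a, b = human_alleles
    if chimp_base == a:
        return a, b
    if chimp_base == b:
        return b, a
    return None


def is_cpg_site(seq, i):
    """Assumed CpG test: position i is the C or the G of a CG dinucleotide."""
    base = seq[i]
    if base == "C":
        return i + 1 < len(seq) and seq[i + 1] == "G"
    if base == "G":
        return i > 0 and seq[i - 1] == "C"
    return False


def asymmetry_ratio(changes, base_counts, change):
    """Ratio of a substitution's rate to the rate of its complement.

    changes     : iterable of (ancestral, derived) pairs on one reference strand
    base_counts : number of sites carrying each base on that strand (assumed
                  stand-in for the background composition correction)
    change      : e.g. ("A", "G"), compared against its complement ("T", "C")
    A value above 1 means an excess of `change` over its complement.
    """
    anc, der = change
    comp = (COMPLEMENT[anc], COMPLEMENT[der])
    counts = Counter(changes)
    # (n_change / N_anc) / (n_comp / N_comp_anc), rearranged into one division
    return (counts[change] * base_counts[comp[0]]) / (counts[comp] * base_counts[anc])


# Toy data, invented purely for illustration.
toy_changes = [("A", "G")] * 30 + [("T", "C")] * 20
toy_bases = {"A": 100, "T": 100, "C": 100, "G": 100}
ratio = asymmetry_ratio(toy_changes, toy_bases, ("A", "G"))
```

In practice the same ratio would be computed separately for each subset (FFD vs. intronic, CpG vs. non-CpG), since the reported contrast only appears once the SNPs are partitioned that way.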

Conclusion: The genome-wide discrepancy of human FFD SNPs provides novel evidence for widespread selective pressure due to functional effects of sSNPs. The similar asymmetry pattern of FFD SNPs and iSNPs that map to a CpG can be explained by transcription-coupled mechanisms, including TCR and transcription-coupled mutation. Because CpG sites are hypermutable, CpG site FFD SNPs tend to be younger and to have been exposed to less selection than non-CpG site FFD SNPs, which can explain the difference in asymmetry between CpG site and non-CpG site FFD SNPs.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Base Composition / genetics*
  • Codon / genetics
  • CpG Islands / genetics
  • Databases, Nucleic Acid
  • Genome / genetics
  • Genome, Human / genetics*
  • Humans
  • Introns / genetics
  • Nucleotides / genetics
  • Pan troglodytes / genetics
  • Point Mutation / genetics
  • Polymorphism, Single Nucleotide / genetics*
  • Selection, Genetic
  • Sequence Analysis, DNA / methods

Substances

  • Codon
  • Nucleotides