Large genomic region free of GWAS-based common variants contains fertility-related genes

PLoS One. 2013 Apr 17;8(4):e61917. doi: 10.1371/journal.pone.0061917. Print 2013.

Abstract

DNA variants, such as single nucleotide polymorphisms (SNPs) and copy number variants (CNVs), are unevenly distributed across the human genome. Currently, dbSNP contains more than 6 million human SNPs, and whole-genome genotyping arrays can assay more than 4 million of them simultaneously. In our study, we first asked whether the assays used in published genome-wide association studies (GWASs) cover all regions of the genome well. Using dbSNP build 135 data, we identified 50 genomic regions longer than 100 Kb that do not contain any common SNPs, i.e., those with minor allele frequency (MAF) ≥ 1%. Second, because conserved regions are generally of functional importance, we examined the genes in those large genomic regions without common SNPs. We found 97 genes, which were enriched for reproductive function. In addition, we filtered out regions with CNVs listed in the Database of Genomic Variants (DGV), segmental duplications from the Human Genome Project, and common variants identified by personal genome sequencing (UCSC). No region survived this filtering. Our analysis suggests that, while there may not be many large genomic regions free of common variants, there are still some "holes" in the current human genomic map for common SNPs. Because GWASs focus only on common SNPs, interpretation of GWAS results should take this limitation into account. In particular, two recent GWASs of fertility may be incomplete because of this map deficit. Additional SNP discovery efforts should pay close attention to these regions.
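
The two steps described in the abstract (finding stretches longer than 100 Kb with no common SNP, then discarding stretches that overlap known structural features) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `snp_free_gaps` and `drop_overlapping`, the 0-based half-open coordinate convention, and the toy input values are all assumptions made for illustration, and assembly gaps such as centromeres are not handled.

```python
from typing import Iterable, List, Tuple

# Assumed convention: 0-based, half-open [start, end) intervals in base pairs.
Interval = Tuple[int, int]


def snp_free_gaps(
    common_snp_positions: Iterable[int],
    chrom_length: int,
    min_length: int = 100_000,
) -> List[Interval]:
    """Return intervals longer than min_length that contain no listed SNP.

    The caller is expected to pass only positions of common SNPs
    (for example, those with MAF >= 0.01) on a single chromosome.
    """
    gaps: List[Interval] = []
    prev_end = 0  # first base after the most recent SNP seen so far
    for pos in sorted(set(common_snp_positions)):
        if pos - prev_end > min_length:
            gaps.append((prev_end, pos))
        prev_end = pos + 1
    # Stretch between the last SNP and the end of the chromosome.
    if chrom_length - prev_end > min_length:
        gaps.append((prev_end, chrom_length))
    return gaps


def drop_overlapping(gaps: List[Interval], features: List[Interval]) -> List[Interval]:
    """Keep only gaps that overlap none of the feature intervals
    (for example, CNVs or segmental duplications)."""
    return [
        (g_start, g_end)
        for g_start, g_end in gaps
        if not any(g_start < f_end and f_start < g_end for f_start, f_end in features)
    ]


# Toy data (hypothetical): (position, minor allele frequency) on one chromosome.
records = [(50_000, 0.20), (120_000, 0.002), (200_000, 0.05), (450_000, 0.01)]
common = [pos for pos, maf in records if maf >= 0.01]

gaps = snp_free_gaps(common, chrom_length=500_000)
surviving = drop_overlapping(gaps, features=[(300_000, 310_000)])
print(gaps)
print(surviving)
```

In this toy input the rare SNP at position 120,000 is ignored by the MAF filter, so the stretch around it still counts as free of common SNPs. The quadratic overlap check is adequate for a handful of regions; a genome-scale analysis would more likely use sorted intervals or an interval tree.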

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Genetic
  • Fertility / genetics*
  • Genes*
  • Genome, Human / genetics*
  • Genome-Wide Association Study*
  • Humans
  • Isochores
  • Molecular Sequence Annotation
  • Polymorphism, Single Nucleotide / genetics*
  • Segmental Duplications, Genomic / genetics
  • Sequence Analysis, DNA
  • Software

Substances

  • Isochores

Grants and funding

This study was supported by the National Basic Research Program (973 Program) (No. 2012CB944601, 2012CB517902, to Hong Jiang), the New Century Excellent Talents in University program (No. NCET-10-0836, to Hong Jiang), and the National Natural Science Foundation of China (No. 61125301 to Min Wu; No. 30971585, 30871354, 30710303061, 30400262, 81271260, to Hong Jiang). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.