Hierarchical Bayes prioritization of marker associations from a genome-wide association scan for further investigation

Genet Epidemiol. 2007 Dec;31(8):871-82. doi: 10.1002/gepi.20248.

Abstract

We describe a hierarchical regression modeling approach to selecting a subset of markers from the first stage of a genome-wide association scan to carry forward to subsequent stages for testing on an independent set of subjects. Rather than simply selecting a subset of the most significant marker-disease associations at some cutoff chosen to maximize the cost efficiency of a multistage design, we propose a prior model for the true noncentrality parameters of these associations composed of a large mass at zero and a continuous distribution of nonzero values. The prior probability of nonzero values and their prior means can be functions of various covariates characterizing each marker, such as their location relative to genes or evolutionarily conserved regions, or prior linkage or association data. We propose to take the top-ranked posterior expectations of the noncentrality parameters for confirmation in later stages of a genome-wide scan. The statistical performance of this approach is compared with the traditional p-value ranking by simulation studies. We show that the ranking by posterior expectations performs better at selecting the true positive associations than a simple ranking of p-values if at least some of the prior covariates have predictive value.
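
The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: each marker's test statistic is treated as normal with unit variance around its noncentrality parameter, the continuous part of the prior is taken to be normal with mean mu and variance tau2, the prior probability of a nonzero value follows a logistic function of a single hypothetical covariate (whether the marker lies in a gene), and the coefficients, z-scores, and function names are invented. Under those assumptions the posterior expectation has a closed form: the posterior probability that the parameter is nonzero times the shrunken conditional mean.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def prior_probability(intercept, coef, covariate):
    """Assumed logistic link from one marker covariate to P(parameter != 0)."""
    return 1.0 / (1.0 + math.exp(-(intercept + coef * covariate)))


def posterior_expectation(z, pi, mu, tau2):
    """Posterior mean of the noncentrality parameter under the assumed prior.

    Assumptions (not taken from the paper): z ~ N(lam, 1); lam = 0 with
    probability 1 - pi, otherwise lam ~ N(mu, tau2).
    """
    # Marginal likelihood of z under the nonzero component and the zero mass.
    nonzero = pi * normal_pdf(z, mu, 1.0 + tau2)
    zero = (1.0 - pi) * normal_pdf(z, 0.0, 1.0)
    p_nonzero = nonzero / (nonzero + zero)
    # Conditional posterior mean given that the parameter is nonzero.
    cond_mean = (tau2 * z + mu) / (1.0 + tau2)
    return p_nonzero * cond_mean


# Hypothetical markers: (name, first-stage z-score, in_gene indicator).
markers = [("m1", 3.0, 0), ("m2", 2.8, 1), ("m3", 1.0, 1)]

scores = {}
for name, z, in_gene in markers:
    pi = prior_probability(-4.0, 2.0, in_gene)  # invented coefficients
    scores[name] = posterior_expectation(z, pi, mu=0.0, tau2=9.0)

# Rank markers by posterior expectation, largest first.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

With these invented inputs, the marker with the slightly smaller z-score but the favorable covariate (m2) is ranked ahead of the marker with the largest z-score (m1), whereas a ranking by p-value alone would put m1 first. This is the kind of reordering the posterior-expectation ranking is meant to produce when covariates are informative.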

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Bayes Theorem
  • Computer Simulation
  • Genetic Diseases, Inborn / genetics
  • Genetic Markers*
  • Genome*
  • Humans
  • Models, Statistical
  • Polymorphism, Single Nucleotide
  • Statistics as Topic*

Substances

  • Genetic Markers