The population-based case-control design has become one of the most popular approaches for conducting genome-wide association scans for rare diseases such as cancer. In this article, we propose a novel method for improving the power of the widely used single-marker two-degrees-of-freedom (2 d.f.) association test for single-nucleotide polymorphisms (SNPs) in case-control studies by exploiting the common assumption of Hardy-Weinberg equilibrium (HWE) for the underlying population. A key feature of the method is that it can relax the assumed model constraints through a completely data-adaptive shrinkage estimation approach, so that the number of false-positive results due to departure from HWE is controlled. The method is computationally simple and scales easily to association tests involving hundreds of thousands or millions of genetic markers. Simulation studies, as well as an application to data from a real genome-wide association study, illustrate that the proposed method is very robust for large-scale association studies and can improve the power for detecting susceptibility SNPs with recessive effects compared with existing methods. Implications of the general estimation strategy beyond the simple 2 d.f. association test are discussed.
© 2009 Wiley-Liss, Inc.
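For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of the standard 2 d.f. genotype test (a Pearson chi-square test on a 2×3 case-control genotype table), together with the genotype counts expected under HWE. It is not the proposed shrinkage method, whose details are not given in the abstract. The function names `genotype_chi2_2df` and `hwe_expected_counts`, and the example counts, are hypothetical and chosen only for illustration.

```python
import math


def genotype_chi2_2df(cases, controls):
    """Standard Pearson chi-square test on a 2x3 genotype table.

    `cases` and `controls` are counts of the three genotypes (AA, Aa, aa).
    Illustrative baseline only; NOT the shrinkage method of the paper.
    Assumes every genotype column has a nonzero total.
    """
    table = [list(cases), list(controls)]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected

    # For a chi-square variable with 2 d.f., P(X > x) = exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


def hwe_expected_counts(counts):
    """Genotype counts (AA, Aa, aa) expected under HWE for a sample."""
    n_hom_major, n_het, n_hom_minor = counts
    n = n_hom_major + n_het + n_hom_minor
    q = (n_het + 2 * n_hom_minor) / (2 * n)  # frequency of allele a
    p = 1.0 - q
    return [n * p * p, 2 * n * p * q, n * q * q]


if __name__ == "__main__":
    # Hypothetical genotype counts, for illustration only.
    stat, p_value = genotype_chi2_2df([30, 50, 20], [50, 40, 10])
    print(f"chi-square = {stat:.3f}, p = {p_value:.4f}")
    print(hwe_expected_counts([50, 40, 10]))
```

The sketch uses only the standard library because the survival function of a chi-square distribution with 2 d.f. has the closed form exp(-x/2). Comparing observed control genotype counts against `hwe_expected_counts` gives a rough sense of how far a sample departs from the HWE assumption that the abstract's method exploits.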