Efficient evaluation of ranking procedures when the number of units is large, with application to SNP identification

Thomas A Louis; Ingo Ruczinski

doi:10.1002/bimj.200900044

Biom J. 2010 Feb;52(1):34-49. doi: 10.1002/bimj.200900044.

Authors

Thomas A Louis¹, Ingo Ruczinski

Affiliation

¹ Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA. tlouis@jhsph.edu

Abstract

Simulation-based assessment is a popular and frequently necessary approach for evaluating statistical procedures. Sometimes overlooked is the ability to take advantage of underlying mathematical relations and we focus on this aspect. We show how to take advantage of large-sample theory when conducting a simulation using the analysis of genomic data as a motivating example. The approach uses convergence results to provide an approximation to smaller-sample results, results that are available only by simulation. We consider evaluating and comparing various ranking-based methods for identifying the most highly associated SNPs in a genome-wide association study, derive integral equation representations of the pre-posterior distribution of percentiles produced by three ranking methods, and provide examples comparing performance. These results are of interest in their own right and set the framework for a more extensive set of comparisons.

Publication types

Comparative Study
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Genome-Wide Association Study / methods*
Humans
Medical Informatics
Models, Genetic
Polymorphism, Single Nucleotide*
