Bayesian models based on test statistics for multiple hypothesis testing problems

Bioinformatics. 2008 Apr 1;24(7):943-9. doi: 10.1093/bioinformatics/btn049. Epub 2008 Feb 1.

Abstract

Motivation: We propose a Bayesian method for multiple hypothesis testing, a problem routinely encountered in bioinformatics research, for example in differential gene expression analysis. Our algorithm models the distributions of test statistics under both the null and alternative hypotheses. Modeling the test statistics directly, rather than the full data, substantially simplifies the definition of posterior model probabilities. Computationally, we apply a Bayesian false discovery rate (FDR) approach to control the number of rejections of null hypotheses. To check whether our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool.

Results: Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. Finally, we apply the proposed methodology to an siRNA screening experiment and a gene expression experiment.
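
To make the idea of a Bayesian FDR rule on test statistics concrete, here is a minimal sketch. It is not the authors' model: the abstract does not specify the null and alternative distributions, so this example assumes z-type statistics with a standard normal null, a wider zero-centered normal alternative, and a fixed prior null proportion. The function names and the parameters `pi0`, `alt_sd`, and `alpha` are hypothetical and chosen only for illustration. The rejection rule shown is a common generic one: reject the hypotheses with the smallest posterior null probabilities for as long as their running average stays at or below the target level.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_null_prob(z, pi0=0.9, alt_sd=3.0):
    """Posterior probability that a test statistic z comes from the null.

    Assumed (illustrative) two-component mixture:
      null:        N(0, 1)         with prior weight pi0
      alternative: N(0, alt_sd^2)  with prior weight 1 - pi0
    """
    f0 = pi0 * normal_pdf(z, 0.0, 1.0)
    f1 = (1.0 - pi0) * normal_pdf(z, 0.0, alt_sd)
    return f0 / (f0 + f1)


def bayesian_fdr_reject(null_probs, alpha=0.05):
    """Return sorted indices of rejected hypotheses.

    Hypotheses are ranked by posterior null probability (ascending).
    The running mean of those probabilities estimates the Bayesian FDR
    of the rejection set; we keep the largest set whose estimate is
    at most alpha. Because the values are sorted ascending, the running
    mean never decreases, so stopping at the first violation is safe.
    """
    order = sorted(range(len(null_probs)), key=lambda i: null_probs[i])
    rejected = []
    running_sum = 0.0
    for k, i in enumerate(order, start=1):
        running_sum += null_probs[i]
        if running_sum / k > alpha:
            break
        rejected.append(i)
    return sorted(rejected)


# Usage: made-up test statistics, two of which are extreme.
z_scores = [0.1, -0.5, 6.0, -7.0, 1.0]
probs = [posterior_null_prob(z) for z in z_scores]
rejected = bayesian_fdr_reject(probs, alpha=0.05)
```

In practice the mixture parameters would be estimated from the data rather than fixed in advance, and the adequacy of the assumed distributions would need to be checked, which is the role the abstract assigns to the graphical model-assessment tool.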

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Data Interpretation, Statistical*
  • Gene Expression Profiling / methods*
  • Models, Biological*
  • Models, Statistical*
  • Pattern Recognition, Automated / methods*
  • Research Design*