Statistical challenges in preprocessing in microarray experiments in cancer

Kouros Owzar; William T Barry; Sin-Ho Jung; Insuk Sohn; Stephen L George

doi:10.1158/1078-0432.CCR-07-4532

Clin Cancer Res. 2008 Oct 1;14(19):5959-66. doi: 10.1158/1078-0432.CCR-07-4532.

Authors

Kouros Owzar¹, William T Barry, Sin-Ho Jung, Insuk Sohn, Stephen L George

Affiliation

¹ Department of Biostatistics and Bioinformatics, and Cancer and Leukemia Group B Statistical Center, Duke University School of Medicine, 2424 Erwin Road, Durham, NC 27705, USA. kouros.owzar@duke.edu

Abstract

Many clinical studies incorporate genomic experiments to investigate the potential associations between high-dimensional molecular data and clinical outcome. A critical first step in the statistical analyses of these experiments is that the molecular data are preprocessed. This article provides an overview of preprocessing methods, including summary algorithms and quality control metrics for microarrays. Some of the ramifications and effects that preprocessing methods have on the statistical results are illustrated. The discussions are centered around a microarray experiment based on lung cancer tumor samples with survival as the clinical outcome of interest. The procedures that are presented focus on the array platform used in this study. However, many of these issues are more general and are applicable to other instruments for genome-wide investigation. The discussions here will provide insight into the statistical challenges in preprocessing microarrays used in clinical studies of cancer. These challenges should not be viewed as inconsequential nuisances but rather as important issues that need to be addressed so that informed conclusions can be drawn.

Publication types

Review

MeSH terms

Algorithms
Biometry / methods*
Computational Biology / methods
Data Interpretation, Statistical
Genomics
Humans
Models, Statistical
Neoplasms / genetics*
Neoplasms / metabolism*
Oligonucleotide Array Sequence Analysis / methods*
Principal Component Analysis
Quality Control
Research Design
Treatment Outcome
