Accurate molecular classification of human cancers based on gene expression using a simple classifier with a pathological tree-based framework

Kerby A Shedden; Jeremy M G Taylor; Thomas J Giordano; Rork Kuick; David E Misek; Gad Rennert; Donald R Schwartz; Stephen B Gruber; Craig Logsdon; Diane Simeone; Sharon L R Kardia; Joel K Greenson; Kathleen R Cho; David G Beer; Eric R Fearon; Samir Hanash

doi:10.1016/S0002-9440(10)63557-2

Am J Pathol. 2003 Nov;163(5):1985-95. doi: 10.1016/S0002-9440(10)63557-2.

Authors

Kerby A Shedden¹, Jeremy M G Taylor, Thomas J Giordano, Rork Kuick, David E Misek, Gad Rennert, Donald R Schwartz, Stephen B Gruber, Craig Logsdon, Diane Simeone, Sharon L R Kardia, Joel K Greenson, Kathleen R Cho, David G Beer, Eric R Fearon, Samir Hanash

Affiliation

¹ Department of Statistics, University of Michigan, Ann Arbor, MI 48109-1027, USA. kshedden@umich.edu

Abstract

Recent studies suggest accurate prediction of tissue of origin for human cancers can be achieved by applying sophisticated statistical learning procedures to gene expression data obtained from DNA microarrays. We have pursued the hypothesis that a more straightforward and equally accurate strategy for classifying human tumors is to use a simple algorithm that considers gene expression levels within a tree-based framework that encodes limited information about pathology and tissue ontogeny. By considering gene expression data within this framework, we found only a small number of genes were required to achieve a relatively high accuracy level in tumor classification. Using as few as 45 genes we were able to classify 157 of 190 human malignant tumors correctly, which is comparable to previous results obtained with sophisticated classifiers using thousands of genes. Our simple classifier accurately predicted the origin of metastatic tumors even when the classifier was trained using only primary tumors, and the classifier produced accurate predictions when trained and tested on expression data from different labs, and from different microarray platforms. Our findings suggest that accurate and robust cancer diagnosis from gene expression profiles can be achieved by mimicking the classification strategies routinely used by surgical pathologists.
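
The tree-based strategy described in the abstract can be illustrated with a small sketch. The abstract does not state the decision rule used at each node, the shape of the tree, or which genes are consulted where, so everything below is an assumption made for illustration: a hypothetical two-level tree, hypothetical class names, toy two-gene expression vectors, and a nearest-centroid rule at every node. It is not the authors' implementation. It also uses every gene at every node, whereas the abstract reports using only a small number of genes.

```python
from math import dist
from statistics import fmean

# Hypothetical tree (an assumption, not taken from the paper): each internal
# node lists its children; any name that is not a key is a leaf tumor class.
TREE = {
    "root": ["adenocarcinoma", "other"],
    "adenocarcinoma": ["colon", "lung"],
    "other": ["lymphoma", "melanoma"],
}

def leaves_under(node):
    """Return all leaf classes below a node (the node itself if it is a leaf)."""
    children = TREE.get(node)
    if children is None:
        return [node]
    return [leaf for child in children for leaf in leaves_under(child)]

def centroid(samples):
    """Gene-wise mean of a list of expression vectors."""
    return [fmean(column) for column in zip(*samples)]

def fit(train):
    """Compute one centroid per child at every internal node.

    train maps each leaf class to a list of expression vectors.
    """
    model = {}
    for node, children in TREE.items():
        model[node] = {
            child: centroid(
                [x for leaf in leaves_under(child) for x in train[leaf]]
            )
            for child in children
        }
    return model

def predict(model, x):
    """Walk from the root to a leaf, choosing the nearest child centroid."""
    node = "root"
    while node in model:
        node = min(model[node], key=lambda child: dist(model[node][child], x))
    return node

# Toy training data (fabricated values for illustration only).
train = {
    "colon": [[5.0, 1.0], [5.2, 0.8]],
    "lung": [[5.0, 3.0], [4.8, 3.2]],
    "lymphoma": [[1.0, 1.0], [0.8, 1.2]],
    "melanoma": [[1.0, 3.0], [1.2, 2.8]],
}
model = fit(train)
print(predict(model, [4.9, 2.9]))
print(predict(model, [1.1, 0.9]))
```

In this sketch each decision involves only the classes that share a parent node, which is one way a tree that encodes pathology could keep the number of genes needed at each step small.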

Publication types

Comparative Study

MeSH terms

Algorithms
Biomarkers, Tumor
Gene Expression Profiling / methods*
Gene Expression Profiling / statistics & numerical data
Humans
Neoplasms / classification*
Neoplasms / diagnosis
Neoplasms / genetics
Observer Variation
Oligonucleotide Array Sequence Analysis / methods*
Oligonucleotide Array Sequence Analysis / statistics & numerical data
Sensitivity and Specificity

Substances

Biomarkers, Tumor
