Background: Breast cancer is a heterogeneous disease, presenting with a wide range of histologic, clinical, and genetic features. Microarray technology has shown promise in predicting outcome in these patients.
Methods: We profiled 162 breast tumors using expression microarrays to stratify tumors based on gene expression. A subset of 55 tumors with extensive follow-up was used to identify gene sets that predicted outcome. The predictive gene set was further tested in previously published data sets.
Results: We used different statistical methods to identify three gene sets associated with disease free survival. A fourth gene set, consisting of 21 genes in common to all three sets, also had the ability to predict patient outcome. To validate the predictive utility of this derived gene set, it was tested in two published data sets from other groups. This gene set resulted in significant separation of patients on the basis of survival in these data sets, correctly predicting outcome in 62-65% of patients. By comparing outcome prediction within subgroups based on ER status, grade, and nodal status, we found that our gene set was most effective in predicting outcome in ER positive and node negative tumors.
Conclusion: This robust gene selection with extensive validation has identified a predictive gene set that may have clinical utility for outcome prediction in breast cancer patients.