Regardless of how creative, innovative, and elegant our computational methods may be, the ultimate proof of an algorithm's worth is the experimentally validated quality of its predictions. Unfortunately, this truism is difficult to put into practice. Modelers usually produce hundreds to hundreds of thousands of predictions, most (if not all) of which go untested. In a best-case scenario, a small subsample of predictions (usually three to ten) is experimentally validated as a quality-control step to attest to the global soundness of the full set of predictions. However, whether this small set is even representative of the algorithm's overall performance is a question usually left unaddressed. Thus, a clear understanding of the strengths and weaknesses of an algorithm most often remains elusive, especially to the experimental biologists who must decide which tool to use to address a specific problem. In this chapter, we describe the first systematic set of challenges posed to the systems biology community in the framework of the DREAM (Dialogue for Reverse Engineering Assessments and Methods) project. These tests, which came to be known as the DREAM2 challenges, consist of data generously donated by participants in the DREAM project and curated so as to pose network-reconstruction problems whose solutions, the actual networks behind the data, were withheld from the participants. An explanation of the resulting five challenges, a global comparison of the submissions, and a discussion of the best-performing strategies are the main topics of this chapter.