Following the publication of the complete human genomic sequence, the post-genomic era is driven by the need to extract useful information from genomic data. Genomic, transcriptomic, proteomic, metabolomic, epidemiological and microbial data each offer a different perspective on gene-environment interactions and the determinants of disease and health. Our goal, and our challenge, is to integrate these very different types of data and perspectives on disease into a global model suitable for dissecting the mechanisms of disease and for predicting novel therapeutic strategies. This review highlights the need for complex data integration and the problems it raises, and proposes a framework for carrying it out. While there are many obstacles to overcome, biological models based upon multiple datasets will probably become the basis for future biomedical research.