Contemporary environmental health sciences draw on large-scale longitudinal studies to understand how environmental exposures and behavioral factors affect the risk of disease and to identify potential underlying mechanisms. In such studies, cohorts of individuals are assembled and followed over time. Each cohort generates hundreds of publications, which are typically neither coherently organized nor summarized, limiting knowledge-driven dissemination. To address this, we propose the Cohort Network, a multilayer knowledge graph approach to extract exposures, outcomes, and their connections. We applied the Cohort Network to 121 peer-reviewed papers published over the past 10 years from the Veterans Affairs (VA) Normative Aging Study (NAS). The Cohort Network visualized connections between exposures and outcomes across different publications and identified key exposures and outcomes, such as air pollution, DNA methylation, and lung function. We demonstrated the utility of the Cohort Network for generating new hypotheses, e.g., identifying potential mediators of exposure-outcome associations. Investigators can use the Cohort Network to summarize a cohort's research and to facilitate knowledge-driven discovery and dissemination.
Keywords: Cohort Network; cohort study; hypothesis generation; knowledge graph; network analysis.