Dissecting heritability, environmental risk, and air pollution causal effects using > 50 million individuals in MarketScan

Nat Commun. 2024 Jun 25;15(1):5357. doi: 10.1038/s41467-024-49566-6.

Abstract

Large national-level electronic health record (EHR) datasets offer new opportunities for disentangling the role of genes and environment through deep phenotype information and approximate pedigree structures. Here we use the approximate geographical locations of patients as a proxy for spatially correlated community-level environmental risk factors. We develop a spatial mixed linear effect (SMILE) model that incorporates both genetic and environmental contributions. We extract EHR and geographical locations from 257,620 nuclear families and compile 1083 disease outcome measurements from the MarketScan dataset. We augment the EHR with publicly available environmental data, including levels of particulate matter 2.5 (PM2.5), nitrogen dioxide (NO2), climate, and sociodemographic data. We refine the estimates of genetic heritability and quantify community-level environmental contributions. We also use wind speed and direction as instrumental variables to assess the causal effects of air pollution. In total, we find that PM2.5 or NO2 has a statistically significant causal effect on 135 diseases, including respiratory, musculoskeletal, digestive, metabolic, and sleep disorders, where PM2.5 and NO2 tend to affect biologically distinct disease categories. These analyses showcase several robust strategies for jointly modeling genetic and environmental effects on disease risk using large EHR datasets and will benefit upcoming biobank studies in the era of precision medicine.
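
The instrumental-variable idea mentioned in the abstract can be illustrated with a minimal two-stage least squares sketch on simulated data. This is not the authors' pipeline: the single scalar "wind" instrument, the continuous outcome, the effect sizes, and the function names `simulate` and `two_stage_least_squares` are all assumptions made for illustration. The point is only that an instrument which shifts pollution but is unrelated to the unobserved confounder can recover the causal slope where a naive regression cannot.

```python
import numpy as np


def simulate(n=20000, true_effect=0.5, seed=0):
    """Simulate a hypothetical confounded exposure-outcome relationship."""
    rng = np.random.default_rng(seed)
    wind = rng.normal(size=n)        # instrument (assumed independent of the confounder)
    confounder = rng.normal(size=n)  # unobserved factor affecting both exposure and outcome
    pollution = 1.0 * wind + 1.0 * confounder + rng.normal(size=n)
    outcome = true_effect * pollution + 2.0 * confounder + rng.normal(size=n)
    return wind, pollution, outcome


def two_stage_least_squares(instrument, exposure, outcome):
    """Return the 2SLS slope of outcome on exposure using one instrument."""
    n = len(outcome)
    # Stage 1: predict the exposure from the instrument (with intercept).
    Z = np.column_stack([np.ones(n), instrument])
    gamma, *_ = np.linalg.lstsq(Z, exposure, rcond=None)
    exposure_hat = Z @ gamma
    # Stage 2: regress the outcome on the predicted exposure.
    X_hat = np.column_stack([np.ones(n), exposure_hat])
    beta, *_ = np.linalg.lstsq(X_hat, outcome, rcond=None)
    return beta[1]


wind, pollution, outcome = simulate()

# Naive OLS slope: biased upward here because the confounder is ignored.
naive_estimate = np.polyfit(pollution, outcome, 1)[0]

# IV slope: uses only the wind-driven variation in pollution.
iv_estimate = two_stage_least_squares(wind, pollution, outcome)

print(f"naive OLS slope: {naive_estimate:.3f}")
print(f"2SLS slope:      {iv_estimate:.3f} (simulated true effect: 0.5)")
```

In this toy setup the naive slope absorbs the confounder's influence, while the 2SLS slope should land close to the simulated effect. A real analysis would also need covariates, valid standard errors, and a model suited to disease outcomes.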

MeSH terms

  • Adult
  • Air Pollutants / adverse effects
  • Air Pollutants / analysis
  • Air Pollutants / toxicity
  • Air Pollution* / adverse effects
  • Electronic Health Records
  • Environmental Exposure / adverse effects
  • Female
  • Gene-Environment Interaction
  • Genetic Predisposition to Disease
  • Humans
  • Male
  • Middle Aged
  • Nitrogen Dioxide* / adverse effects
  • Nitrogen Dioxide* / analysis
  • Particulate Matter* / adverse effects
  • Risk Factors

Substances

  • Particulate Matter
  • Nitrogen Dioxide
  • Air Pollutants