Detecting rare and common haplotype-environment interaction under uncertainty of gene-environment independence assumption

Biometrics. 2017 Mar;73(1):344-355. doi: 10.1111/biom.12567. Epub 2016 Aug 1.

Abstract

Finding rare variants and gene-environment interactions (GXE) is critical in dissecting complex diseases. We consider the problem of detecting GXE where G is a rare haplotype and E is a nongenetic factor. Existing methods for this problem typically assume G-E independence, which may not hold in many applications. A pertinent example is lung cancer: there is evidence that variants on chromosome 15q25.1 interact with smoking to affect the risk. However, these variants are also associated with smoking behavior, rendering the assumption of G-E independence inappropriate. Motivated by the need to detect GXE under G-E dependence, we extend an existing approach, logistic Bayesian LASSO (LBL), which assumes G-E independence (LBL-GXE-I), by modeling G-E dependence through a multinomial logistic regression (referred to as LBL-GXE-D). Unlike LBL-GXE-I, LBL-GXE-D controls type I error rates in all situations; however, it has reduced power when G-E independence holds. To control type I error without sacrificing power, we further propose a unified approach, LBL-GXE, which incorporates uncertainty in the G-E independence assumption by employing a reversible jump Markov chain Monte Carlo method. Our simulations show that LBL-GXE has power similar to that of LBL-GXE-I when G-E independence holds, yet has well-controlled type I error rates in all situations. To illustrate the utility of LBL-GXE, we analyzed a lung cancer dataset and found several significant interactions in the 15q25.1 region, including one between a specific rare haplotype and smoking.
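
The idea of modeling G-E dependence through a multinomial logistic regression can be illustrated with a minimal sketch. This is not the authors' parameterization, which the abstract does not give. It assumes a baseline-category multinomial logit for haplotype frequencies given a binary E, and the function name `haplotype_freqs` and all numeric values are hypothetical. When every slope is zero the frequencies do not depend on E, which corresponds to G-E independence.

```python
import numpy as np

def haplotype_freqs(intercepts, slopes, e):
    """Haplotype frequencies given E = e under a baseline-category
    multinomial logit (illustrative assumption, not the paper's model).
    The last haplotype is the baseline, with linear predictor fixed at 0."""
    eta = np.asarray(intercepts, float) + np.asarray(slopes, float) * e
    eta = np.append(eta, 0.0)      # baseline haplotype
    w = np.exp(eta - eta.max())    # subtract max for numerical stability
    return w / w.sum()

# Hypothetical values: haplotype 0 is rare, haplotype 2 is the baseline.
intercepts = [-3.0, -1.0]

# G-E independence: all slopes are zero, so E does not shift frequencies.
indep_e0 = haplotype_freqs(intercepts, [0.0, 0.0], e=0)
indep_e1 = haplotype_freqs(intercepts, [0.0, 0.0], e=1)

# G-E dependence: the rare haplotype is more frequent when E = 1.
dep_e0 = haplotype_freqs(intercepts, [0.8, 0.0], e=0)
dep_e1 = haplotype_freqs(intercepts, [0.8, 0.0], e=1)

print("independent:", indep_e0, indep_e1)
print("dependent:  ", dep_e0, dep_e1)
```

In this sketch, the choice between the independence and dependence models amounts to whether the slope vector is fixed at zero or left free, which is the kind of model uncertainty a reversible jump sampler can move across.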

Keywords: G-E dependence; GXE; LBL; Missing heritability; Rare variants; Reversible jump MCMC.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Biometry / methods*
  • Chromosomes, Human, Pair 15
  • Computer Simulation
  • Data Interpretation, Statistical
  • Gene-Environment Interaction*
  • Genetic Variation
  • Haplotypes
  • Humans
  • Logistic Models
  • Lung Neoplasms / etiology*
  • Lung Neoplasms / genetics
  • Models, Genetic
  • Risk
  • Smoking / adverse effects*
  • Smoking / genetics