The proportional hazards (PH) model is commonly used in epidemiology despite its stringent assumption that hazards remain proportional over time. We previously showed, using detailed simulation data, that the impact of a modest risk factor cannot be estimated reliably using the PH model in the presence of confounding by a strong, time-dependent risk factor. Here, we examine the same and related issues using a real dataset.
Among 97,303 women in the prospective Nurses' Health Study cohort from 1994 through 2010, we used PH regression to investigate how effect estimates for cigarette smoking are affected by increasingly detailed specification of time-dependent exposure characteristics. We also examined how effect estimates for fine particulate matter (PM2.5), a modest risk factor, are affected by finer control for time-dependent confounding by smoking. The objective of this analysis is not to present a credible estimate of the impact of PM2.5 on lung cancer risk, but to show that estimates based on the PH model are inherently unreliable.
The best-fitting model for cigarette smoking and lung cancer included pack-years, duration, time since cessation, and an age-by-pack-years interaction, indicating that the hazard ratio (HR) for pack-years was significantly modified by age. In the fully adjusted, best-fitting smoking model that included pack-years, the HR per 10-µg/m³ increase in PM2.5 was 1.06 (95% confidence interval (CI) = 0.90, 1.25); in the full cohort, the HR for PM2.5 ranged from 1.02 to 1.10 in models with other smoking adjustments, indicating residual confounding by smoking. The HR for PM2.5 was statistically significant only among former smokers, and only when adjusting for smoking pack-years (HR = 1.35, 95% CI = 1.00, 1.82 in the best-fitting smoking model); it was not significant in models adjusting for smoking duration and average packs (pack-years divided by duration).
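As a rough illustration of what the age-by-pack-years interaction implies, the best-fitting smoking specification can be sketched as a PH model. The symbols below (P for pack-years, D for duration, C for time since cessation, A for age, Z for PM2.5, and all coefficients) are notation assumed here for illustration only; other covariates are omitted, and the exact parameterization used in the analysis may differ.

```latex
% Schematic only: symbols and functional form are assumptions for illustration.
% Time-dependent PH model with an age-by-pack-years interaction:
\[
  h(t \mid \cdot) = h_0(t)\,
  \exp\!\bigl(\beta_P P(t) + \beta_D D(t) + \beta_C C(t)
            + \beta_{AP}\, A(t)\, P(t) + \gamma Z(t)\bigr)
\]

% HR for a one-unit increase in pack-years at age A, other terms held fixed:
\[
  \mathrm{HR}_P(A) = \exp\!\bigl(\beta_P + \beta_{AP} A\bigr)
\]

% HR per 10-unit (10 \mu g/m^3) increase in PM2.5:
\[
  \mathrm{HR}_Z = \exp(10\,\gamma)
\]

% Average packs, as defined in the text (pack-years divided by duration):
\[
  \bar{p}(t) = P(t) / D(t)
\]
```

Under this form, the HR for pack-years is constant across ages only if the interaction coefficient is zero; a nonzero interaction means that no single HR summarizes the effect of cumulative smoking.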
The association between cumulative smoking and lung cancer is modified by age, and improved model fit is obtained by including multiple time-varying components of smoking history. The association with PM2.5 is residually confounded by smoking and modified by smoking status. These findings underscore limitations of the PH model and emphasize the advantages of directly estimating hazard functions to characterize time-varying exposure and risk. The hazard function, not the relative hazard, is the fundamental measure of risk in a population. As a consequence, the use of time-dependent PH models does not address crucial issues introduced by temporal factors in epidemiological data.
Keywords: Air pollution; Cox proportional hazards regression; confounding; effect modification; epidemiology; lung cancer; particulate matter; smoking.