While randomized trials remain the best evidence for treatment effectiveness, lack of generalizability is often an important concern. Additionally, when a new treatment is compared against an existing standard of care, its potentially small benefit may be difficult to detect in a trial without extremely large sample sizes and long follow-up times. Recent advances in "data fusion" provide a framework for combining results across studies that are applicable to a given population of interest, allowing treatment comparisons that may not be feasible with traditional study designs. We propose a data fusion-based estimator that combines information from two studies: (1) a study comparing a new treatment to the standard of care in the local population of interest, and (2) a study comparing the standard of care to placebo in a separate, distal population. We provide conditions under which the parameter of interest can be identified from these two studies and explore the properties of the estimator through simulation. Finally, we apply the estimator to data from two randomized trials to estimate the effect of triple therapy versus monotherapy for the treatment of HIV. The proposed estimator can account for underlying population structures that induce differences in case mix, adherence, and outcome prevalence between the local and distal populations, as well as for potentially informative loss to follow-up. Approaches like those detailed here are increasingly important for speeding the approval and adoption of effective new therapies by leveraging multiple sources of information.
Keywords: causal inference; data fusion; generalizability; randomized controlled trials; transportability.
© 2021 John Wiley & Sons Ltd.