Using interviewer random effects to remove selection bias from HIV prevalence estimates

Mark E McGovern; Till Bärnighausen; Joshua A Salomon; David Canning

doi:10.1186/1471-2288-15-8

BMC Med Res Methodol. 2015 Feb 5;15:8. doi: 10.1186/1471-2288-15-8.

Authors

Mark E McGovern¹ ², Till Bärnighausen³ ⁴, Joshua A Salomon⁵, David Canning⁶ ⁷

Affiliations

¹ Harvard Center for Population and Development Studies, 9 Bow Street, Cambridge, MA, 02138, USA. mcgovern@hsph.harvard.edu.
² Department of Global Health and Population, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA. mcgovern@hsph.harvard.edu.
³ Department of Global Health and Population, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA. tbaernig@hsph.harvard.edu.
⁴ Wellcome Trust Africa Centre for Health and Population Studies, University of KwaZulu-Natal, Mtubatuba, South Africa. tbaernig@hsph.harvard.edu.
⁵ Department of Global Health and Population, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA. jsalomon@hsph.harvard.edu.
⁶ Harvard Center for Population and Development Studies, 9 Bow Street, Cambridge, MA, 02138, USA. dcanning@hsph.harvard.edu.
⁷ Department of Global Health and Population, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA. dcanning@hsph.harvard.edu.

Abstract

Background: Selection bias in HIV prevalence estimates occurs if non-participation in testing is correlated with HIV status. Longitudinal data suggest that individuals who know or suspect they are HIV positive are less likely to participate in testing in HIV surveys. In that case, methods that correct for missing data by imputation based on observed characteristics will produce biased results.
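The mechanism described above can be illustrated with a small simulation. This is a minimal sketch, not the authors' analysis: the true prevalence and the two consent probabilities are hypothetical values chosen only for illustration, and the function name `simulate` is an assumption. Because consent depends on HIV status itself and no observed covariate explains the difference, the prevalence among those tested understates the true prevalence.

```python
import random


def simulate(n=200_000, prevalence=0.15, p_test_neg=0.85, p_test_pos=0.60, seed=0):
    """Return (true prevalence, prevalence among those who consent to test).

    All parameter values are hypothetical. HIV-positive respondents are
    assumed to consent less often than HIV-negative respondents.
    """
    rng = random.Random(seed)
    positives = tested = tested_positive = 0
    for _ in range(n):
        hiv = rng.random() < prevalence
        # Consent probability depends on the (unobserved) HIV status.
        consent = rng.random() < (p_test_pos if hiv else p_test_neg)
        positives += hiv
        if consent:
            tested += 1
            tested_positive += hiv
    return positives / n, tested_positive / tested


true_prev, naive_prev = simulate()
print(f"true prevalence:           {true_prev:.3f}")
print(f"prevalence among testers:  {naive_prev:.3f}")
```

With these assumed inputs the tested-only estimate falls below the true value, which is the direction of bias the abstract describes for surveys in which people who suspect they are positive decline to test.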

Methods: The identity of the HIV survey interviewer is typically associated with HIV testing participation, but is unlikely to be correlated with HIV status. Interviewer identity can thus be used as a selection variable allowing estimation of Heckman-type selection models. These models produce asymptotically unbiased HIV prevalence estimates, even when non-participation is correlated with unobserved characteristics, such as knowledge of HIV status. We introduce a new random effects method to these selection models which overcomes non-convergence caused by collinearity, small sample bias, and incorrect inference in existing approaches. Our method is easy to implement in standard statistical software, and allows the construction of bootstrapped standard errors which adjust for the fact that the relationship between testing and HIV status is uncertain and needs to be estimated.
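A Heckman-type selection model for a binary outcome can be sketched as a pair of probit equations. The notation below is an assumption made for illustration, since the abstract gives no equations: $s_{ij}$ indicates whether respondent $i$ interviewed by interviewer $j$ consents to test, $h_{ij}$ is HIV status, $x_{ij}$ are observed characteristics, and $z_j$ is an interviewer effect treated as a random effect.

```latex
\begin{align}
s_{ij}^{*} &= x_{ij}'\beta_s + \gamma z_j + u_{ij},
  & s_{ij} &= \mathbf{1}\,[\,s_{ij}^{*} > 0\,] \\
h_{ij}^{*} &= x_{ij}'\beta_h + \varepsilon_{ij},
  & h_{ij} &= \mathbf{1}\,[\,h_{ij}^{*} > 0\,]
  \quad \text{(observed only if } s_{ij} = 1\text{)} \\
(u_{ij}, \varepsilon_{ij}) &\sim
  N\!\left(0,\ \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right),
  & z_j &\sim N(0, \sigma_z^2)
\end{align}
```

In this sketch the interviewer effect $z_j$ enters the consent equation but not the HIV status equation, which is the exclusion restriction implied by the statement that interviewer identity predicts participation but not HIV status. A nonzero $\rho$ corresponds to selection on unobserved characteristics; when $\rho = 0$ the tested sample is informative about non-testers conditional on $x_{ij}$.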

Results: Using nationally representative data from the Demographic and Health Surveys, we illustrate our approach with new point estimates and confidence intervals (CI) for HIV prevalence among men in Ghana (2003) and Zambia (2007). In Ghana, we find little evidence of selection bias as our selection model gives an HIV prevalence estimate of 1.4% (95% CI 1.2% - 1.6%), compared to 1.6% among those with a valid HIV test. In Zambia, our selection model gives an HIV prevalence estimate of 16.3% (95% CI 11.0% - 18.4%), compared to 12.1% among those with a valid HIV test. Therefore, those who decline to test in Zambia are found to be more likely to be HIV positive.
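The Zambia comparison can be read as a rough mixture calculation: if the overall prevalence is a participation-weighted average of prevalence among testers and non-testers, the implied prevalence among non-testers follows by rearrangement. The sketch below uses the two Zambia figures from the abstract, but the participation rate is a hypothetical placeholder (the abstract does not report it), and the calculation ignores survey weights and covariates, so it is an illustration rather than a result.

```python
def implied_prevalence_among_untested(overall, tested, participation):
    """Solve overall = participation * tested + (1 - participation) * untested."""
    if not 0 < participation < 1:
        raise ValueError("participation must be strictly between 0 and 1")
    return (overall - participation * tested) / (1 - participation)


# Figures from the abstract (Zambia 2007, men).
OVERALL = 0.163  # selection-model estimate
TESTED = 0.121   # prevalence among those with a valid HIV test

# Hypothetical participation rate, assumed only for illustration.
PARTICIPATION = 0.75

implied = implied_prevalence_among_untested(OVERALL, TESTED, PARTICIPATION)
print(f"implied prevalence among non-testers: {implied:.3f}")
```

Under the assumed 75% participation, the implied prevalence among non-testers is roughly 29%, well above the 12.1% among testers. A different participation rate would change this number but not the direction of the gap.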

Conclusions: Our approach corrects for selection bias in HIV prevalence estimates, can be implemented even when HIV prevalence or non-participation is very high or very low, and provides a practical way to account for both sampling and parameter uncertainty when estimating confidence intervals. The wide confidence intervals estimated in an example with high HIV prevalence indicate that it is difficult to correct statistically for the bias that may occur when a large proportion of people refuse to test.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Adolescent
Adult
Algorithms
Ghana / epidemiology
HIV Infections / diagnosis*
HIV Infections / epidemiology*
Health Surveys / methods*
Health Surveys / statistics & numerical data
Humans
Interviews as Topic / methods*
Mass Screening / methods
Mass Screening / statistics & numerical data
Middle Aged
Models, Statistical
Prevalence
Selection Bias
Young Adult
Zambia / epidemiology
