Accuracy of Patient Health Questionnaire-9 (PHQ-9) for screening to detect major depression: individual participant data meta-analysis

BMJ. 2019 Apr 9;365:l1476. doi: 10.1136/bmj.l1476.

Abstract

Objective: To determine the accuracy of the Patient Health Questionnaire-9 (PHQ-9) for screening to detect major depression.

Design: Individual participant data meta-analysis.

Data sources: Medline, Medline In-Process and Other Non-Indexed Citations, PsycINFO, and Web of Science (January 2000-February 2015).

Inclusion criteria: Eligible studies compared PHQ-9 scores with major depression diagnoses from validated diagnostic interviews. Primary study data and study level data extracted from primary reports were synthesized. For PHQ-9 cut-off scores 5-15, bivariate random effects meta-analysis was used to estimate pooled sensitivity and specificity, separately, among studies that used semistructured diagnostic interviews, which are designed for administration by clinicians; fully structured interviews, which are designed for lay administration; and the Mini International Neuropsychiatric Interview (MINI), a brief fully structured interview. Sensitivity and specificity were examined among participant subgroups and, separately, using meta-regression, considering all subgroup variables in a single model.
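As a rough illustration of the quantities being pooled, the sketch below computes sensitivity and specificity for a single study at each PHQ-9 cut-off from 5 to 15. The function names and the ten-participant dataset are hypothetical and invented for illustration; they are not data from the meta-analysis. The sketch covers only the per-study step and does not implement the bivariate random effects model used to pool estimates across studies.

```python
from typing import Dict, Sequence, Tuple


def sens_spec(scores: Sequence[int], diagnosed: Sequence[bool],
              cutoff: int) -> Tuple[float, float]:
    """Sensitivity and specificity of 'score >= cutoff' against the
    reference diagnosis (True = major depression per interview)."""
    pairs = list(zip(scores, diagnosed))
    tp = sum(1 for s, d in pairs if d and s >= cutoff)      # cases screened positive
    fn = sum(1 for s, d in pairs if d and s < cutoff)       # cases missed
    tn = sum(1 for s, d in pairs if not d and s < cutoff)   # non-cases screened negative
    fp = sum(1 for s, d in pairs if not d and s >= cutoff)  # non-cases screened positive
    return tp / (tp + fn), tn / (tn + fp)


def accuracy_by_cutoff(scores: Sequence[int], diagnosed: Sequence[bool],
                       cutoffs: Sequence[int] = tuple(range(5, 16))
                       ) -> Dict[int, Tuple[float, float]]:
    """Map each cut-off to its (sensitivity, specificity) pair."""
    return {c: sens_spec(scores, diagnosed, c) for c in cutoffs}


# Hypothetical toy data: PHQ-9 total scores (0-27) and interview diagnoses.
toy_scores = [3, 7, 12, 15, 9, 4, 11, 20, 6, 10]
toy_diagnosed = [False, False, True, True, False, False, False, True, False, True]

if __name__ == "__main__":
    for cutoff, (sens, spec) in accuracy_by_cutoff(toy_scores, toy_diagnosed).items():
        print(f"cut-off >= {cutoff:2d}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Raising the cut-off trades sensitivity for specificity, which is why the analysis reports both measures across the full 5-15 range rather than at a single threshold.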

Results: Data were obtained for 58 of 72 eligible studies (total n=17 357; major depression cases n=2312). Combined sensitivity and specificity was maximized at a cut-off score of 10 or above among studies using a semistructured interview (29 studies, 6725 participants; sensitivity 0.88, 95% confidence interval 0.83 to 0.92; specificity 0.85, 0.82 to 0.88). Across cut-off scores 5-15, sensitivity with semistructured interviews was 5-22% higher than for fully structured interviews (MINI excluded; 14 studies, 7680 participants) and 2-15% higher than for the MINI (15 studies, 2952 participants). Specificity was similar across diagnostic interviews. The PHQ-9 seems to be similarly sensitive but may be less specific for younger patients than for older patients; a cut-off score of 10 or above can be used regardless of age.
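The "combined sensitivity and specificity" criterion can be illustrated with the one pair of pooled estimates reported above. The abstract does not state how the two measures were combined; this minimal sketch assumes a plain sum, which ranks cut-offs identically to Youden's J (sensitivity + specificity - 1). The function names are hypothetical.

```python
def combined_accuracy(sensitivity: float, specificity: float) -> float:
    """Assumed combination rule: the simple sum of the two measures."""
    return sensitivity + specificity


def youden_j(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic; differs from the sum only by a constant 1."""
    return sensitivity + specificity - 1.0


# Pooled estimates reported for cut-off >= 10 with semistructured interviews.
reported_sensitivity = 0.88
reported_specificity = 0.85

if __name__ == "__main__":
    total = combined_accuracy(reported_sensitivity, reported_specificity)
    j = youden_j(reported_sensitivity, reported_specificity)
    print(f"combined: {total:.2f}, Youden's J: {j:.2f}")
```

Selecting the optimal cut-off would mean evaluating this quantity at every score from 5 to 15 and taking the largest value; only the estimates for a cut-off of 10 appear in the abstract, so the other cut-offs are not reproduced here.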

Conclusions: PHQ-9 sensitivity compared with semistructured diagnostic interviews was greater than in previous conventional meta-analyses that combined reference standards. A cut-off score of 10 or above maximized combined sensitivity and specificity overall and for subgroups.

Registration: PROSPERO CRD42014010673.

Publication types

  • Comparative Study
  • Meta-Analysis
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Systematic Review

MeSH terms

  • Aged
  • Data Accuracy*
  • Depressive Disorder, Major / diagnosis*
  • Depressive Disorder, Major / epidemiology
  • Female
  • Humans
  • Interview, Psychological / methods
  • Male
  • Mass Screening / methods*
  • Middle Aged
  • Patient Health Questionnaire / statistics & numerical data*
  • Psychiatric Status Rating Scales / standards
  • Sensitivity and Specificity