TY - JOUR
T1 - Accuracy of the PHQ-2 Alone and in Combination With the PHQ-9 for Screening to Detect Major Depression
T2 - Systematic Review and Meta-analysis
AU - Depression Screening Data (DEPRESSD) PHQ Collaboration
AU - Levis, Brooke
AU - Sun, Ying
AU - He, Chen
AU - Wu, Yin
AU - Krishnan, Ankur
AU - Bhandari, Parash Mani
AU - Neupane, Dipika
AU - Imran, Mahrukh
AU - Brehaut, Eliana
AU - Negeri, Zelalem
AU - Fischer, Felix H
AU - Benedetti, Andrea
AU - Thombs, Brett D
AU - Che, Liying
AU - Levis, Alexander
AU - Riehm, Kira
AU - Saadat, Nazanin
AU - Azar, Marleine
AU - Rice, Danielle
AU - Boruff, Jill
AU - Kloda, Lorie
AU - Cuijpers, Pim
AU - Gilbody, Simon
AU - Ioannidis, John
AU - McMillan, Dean
AU - Patten, Scott
AU - Shrier, Ian
AU - Ziegelstein, Roy
AU - Moore, Ainsley
AU - Akena, Dickens
AU - Amtmann, Dagmar
AU - Arroll, Bruce
AU - Ayalon, Liat
AU - Baradaran, Hamid
AU - Beraldi, Anna
AU - Bernstein, Charles
AU - Bhana, Arvin
AU - Bombardier, Charles
AU - Buji, Ryna Imma
AU - Butterworth, Peter
AU - Carter, Gregory
AU - Chagas, Marcos
AU - Chan, Juliana
AU - Chan, Lai Fong
AU - Chibanda, Dixon
AU - Cholera, Rushina
AU - Clover, Kerrie
AU - Conway, Aaron
AU - Delgadillo, Jaime
AU - van der Feltz-Cornelis, Christina
N1 - © 2020 American Medical Association. This is an author-produced version of the published paper. Uploaded in accordance with the publisher’s self-archiving policy. Further copying may not be permitted; contact the publisher for details.
PY - 2020/6/9
Y1 - 2020/6/9
AB - Importance: The Patient Health Questionnaire depression module (PHQ-9) is a 9-item self-administered instrument used for detecting depression and assessing severity of depression. The Patient Health Questionnaire-2 (PHQ-2) consists of the first 2 items of the PHQ-9 (which assess the frequency of depressed mood and anhedonia) and can be used as a first step to identify patients for evaluation with the full PHQ-9. Objective: To estimate PHQ-2 accuracy alone and combined with the PHQ-9 for detecting major depression. Data Sources: MEDLINE, MEDLINE In-Process & Other Non-Indexed Citations, PsycINFO, and Web of Science (January 2000-May 2018). Study Selection: Eligible data sets compared PHQ-2 scores with major depression diagnoses from a validated diagnostic interview. Data Extraction and Synthesis: Individual participant data were synthesized with bivariate random-effects meta-analysis to estimate pooled sensitivity and specificity of the PHQ-2 alone among studies using semistructured, fully structured, or Mini International Neuropsychiatric Interview (MINI) diagnostic interviews separately and in combination with the PHQ-9 vs the PHQ-9 alone for studies that used semistructured interviews. The PHQ-2 score ranges from 0 to 6, and the PHQ-9 score ranges from 0 to 27. Results: Individual participant data were obtained from 100 of 136 eligible studies (44 318 participants; 4572 with major depression [10%]; mean [SD] age, 49 [17] years; 59% female). Among studies that used semistructured interviews, PHQ-2 sensitivity and specificity (95% CI) were 0.91 (0.88-0.94) and 0.67 (0.64-0.71) for cutoff scores of 2 or greater and 0.72 (0.67-0.77) and 0.85 (0.83-0.87) for cutoff scores of 3 or greater. Sensitivity was significantly greater for semistructured vs fully structured interviews. Specificity was not significantly different across the types of interviews. The area under the receiver operating characteristic curve was 0.88 (0.86-0.89) for semistructured interviews, 0.82 (0.81-0.84) for fully structured interviews, and 0.87 (0.85-0.88) for the MINI. There were no significant subgroup differences. For semistructured interviews, sensitivity for PHQ-2 scores of 2 or greater followed by PHQ-9 scores of 10 or greater (0.82 [0.76-0.86]) was not significantly different than PHQ-9 scores of 10 or greater alone (0.86 [0.80-0.90]); specificity for the combination was significantly but minimally higher (0.87 [0.84-0.89] vs 0.85 [0.82-0.87]). The area under the curve was 0.90 (0.89-0.91). The combination was estimated to reduce the number of participants needing to complete the full PHQ-9 by 57% (56%-58%). Conclusions and Relevance: In an individual participant data meta-analysis of studies that compared PHQ scores with major depression diagnoses, the combination of PHQ-2 (with cutoff ≥2) followed by PHQ-9 (with cutoff ≥10) had similar sensitivity but higher specificity compared with PHQ-9 cutoff scores of 10 or greater alone. Further research is needed to understand the clinical and research value of this combined approach to screening.
KW - Adult
KW - Depressive Disorder, Major/classification
KW - Female
KW - Humans
KW - Interviews as Topic
KW - Male
KW - Mass Screening/methods
KW - Patient Health Questionnaire
KW - ROC Curve
KW - Sensitivity and Specificity
U2 - 10.1001/jama.2020.6504
DO - 10.1001/jama.2020.6504
M3 - Article
C2 - 32515813
SN - 0098-7484
VL - 323
SP - 2290
EP - 2300
JO - JAMA
JF - JAMA
IS - 22
ER -