Objective: We aimed to collect and meta-analyse the existing evidence regarding the performance of the Center for Epidemiologic Studies Depression Scale (CES-D) for detecting depression in general population and primary care settings.
Method: We conducted a systematic literature search in PubMed and PsycINFO. Eligible studies were: a) validation studies of screening questionnaires with information on the accuracy of the CES-D; b) samples from general populations or primary care settings; c) standardized diagnostic interviews following standard classification systems used as gold standard; and d) English or Spanish language of publication. Pooled sensitivity, specificity, likelihood ratios and diagnostic odds ratio (DOR) were estimated for several cut-off points using a bivariate mixed effects model for each threshold. The summary receiver operating characteristic curve was estimated with Rutter and Gatsonis mixed effects models, and the area under the curve was calculated. Quality of the studies was assessed with the QUADAS tool. Causes of heterogeneity were evaluated with the Rutter and Gatsonis mixed effects model, including one covariate at a time.
Results: 28 studies (10,617 participants) met eligibility criteria. The median prevalence of major depression was 8.8% (interquartile range 3.8% to 12.6%). The overall area under the curve was 0.87. At cut-off 16, sensitivity was 0.87 (95% CI: 0.82–0.92), specificity 0.70 (95% CI: 0.65–0.75), and DOR 16.2 (95% CI: 10.49–25.10). A better trade-off between sensitivity and specificity was observed at cut-off 20 (sensitivity = 0.83, specificity = 0.78, DOR = 16.64). None of the variables assessed as possible sources of heterogeneity was found to be statistically significant.
Conclusion: The CES-D has acceptable screening accuracy in the general population or primary care settings, but it should not be used as an isolated diagnostic measure of depression. Depending on the test objectives, a cut-off of 20 may be more adequate than the typically recommended value of 16.
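To make the reported accuracy figures more concrete, the following is a minimal sketch of how likelihood ratios, a diagnostic odds ratio, and predictive values follow from a single sensitivity/specificity pair. The function name `screening_metrics` and the choice of the median prevalence (8.8%) as input are illustrative assumptions, not part of the original analysis. The pooled DOR in the abstract comes from the fitted bivariate model, so a value recomputed from the rounded point estimates is not expected to match it exactly.

```python
def screening_metrics(sensitivity, specificity, prevalence):
    """Derive standard screening metrics from sensitivity, specificity and prevalence.

    Illustrative helper (hypothetical name); all inputs are proportions in (0, 1).
    """
    # Likelihood ratios: how much a positive/negative result shifts the odds of disease.
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity

    # Diagnostic odds ratio: ratio of the two likelihood ratios.
    dor = lr_pos / lr_neg

    # Expected proportions of each outcome in a population with the given prevalence.
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)

    return {
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
        "ppv": true_pos / (true_pos + false_pos),  # positive predictive value
        "npv": true_neg / (true_neg + false_neg),  # negative predictive value
    }


# Point estimates from the abstract; prevalence is the reported median (an assumption here).
cutoff_16 = screening_metrics(sensitivity=0.87, specificity=0.70, prevalence=0.088)
cutoff_20 = screening_metrics(sensitivity=0.83, specificity=0.78, prevalence=0.088)

for label, metrics in (("cut-off 16", cutoff_16), ("cut-off 20", cutoff_20)):
    print(label, {name: round(value, 3) for name, value in metrics.items()})
```

Under these assumptions the positive predictive value is only about 0.22 at cut-off 16 and about 0.27 at cut-off 20, which is consistent with the conclusion that a positive CES-D result on its own should not be treated as a diagnosis.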
The accuracy of parameter estimates and standard errors for F. Samejima's graded response model was examined across 324 conditions. Full information maximum likelihood (FIML) was compared with a 3-stage estimator for categorical item factor analysis (CIFA) when the unweighted least squares method was used in CIFA's third stage. CIFA is much faster in estimating multidimensional models, particularly with correlated dimensions. Overall, CIFA yields slightly more accurate parameter estimates, and FIML yields slightly more accurate standard errors. Yet, across most conditions, differences between methods are negligible. FIML is the best choice in small samples (200 observations). CIFA is the best choice in larger samples (on computational grounds). Both methods failed in a number of conditions, most of which involved 200 observations, few indicators per dimension, highly skewed items, or low factor loadings. These conditions are to be avoided in applications.
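For readers unfamiliar with the model being estimated, here is a minimal sketch of how a unidimensional graded response model assigns probabilities to ordered response categories. It assumes the logistic form of the model; the abstract does not state which parameterization the study used, and the function name and parameter values are illustrative only.

```python
import math


def grm_category_probs(theta, discrimination, thresholds):
    """Category probabilities for one item under a logistic graded response model.

    theta          -- latent trait value of the respondent
    discrimination -- item slope (a > 0)
    thresholds     -- increasing category boundaries b_1 < ... < b_{K-1}
    Returns a list of K probabilities, one per ordered response category.
    """
    # Cumulative probabilities P(X >= k): 1 for the lowest category,
    # a logistic curve at each threshold, and 0 beyond the highest category.
    cumulative = [1.0]
    for b in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-discrimination * (theta - b))))
    cumulative.append(0.0)

    # Each category probability is the difference of adjacent cumulative probabilities.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical 4-category item; the parameter values are made up for illustration.
probs = grm_category_probs(theta=0.5, discrimination=1.2, thresholds=[-1.0, 0.0, 1.5])
print([round(p, 3) for p in probs])
```

Highly skewed items correspond to thresholds that sit far to one side of the trait distribution, so some categories are rarely observed. That is one plausible reason such conditions are hard to estimate with only 200 observations.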
The SF-12 yielded acceptable results for detecting both active and recent depressive disorders in general population samples, suggesting that the questionnaire could serve as a useful screening tool for monitoring the prevalence of affective disorders and for targeting treatment and prevention.