Reliability, Item Functioning, and Gender Bias of the CES-D Scale in Community-Dwelling Elderly: Findings from the ELSA Cohort
Antonio Reis de Sá-Junior, Vitor A Petrilli-Mazon, Ione Schneider, Yuan-Pang Wang, Cesar Oliveira
Revista brasileira de psiquiatria (Sao Paulo, Brazil : 1999). Journal Article. Published 2025-10-05. DOI: 10.47626/1516-4446-2025-4401 (https://doi.org/10.47626/1516-4446-2025-4401)
Abstract
Objective: To evaluate the item performance of the Center for Epidemiologic Studies Depression (CES-D) scale.
Methods: Participants were adults aged 50 and older from the English Longitudinal Study of Ageing (ELSA). Using classical test theory and item response theory, data from 11,612 participants were analyzed to estimate reliability, item discrimination (a), and item difficulty (b). Differential Item Functioning (DIF) analyses assessed whether individuals from different gender groups responded differently to items despite similar depressive symptom levels.
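As an illustration of what the discrimination (a) and difficulty (b) parameters mean, the sketch below evaluates a two-parameter logistic (2PL) item response function. The abstract does not state which IRT model was fitted, so the 2PL form is an assumption here, and the function name `two_pl_probability` and the parameter values are hypothetical, chosen only to show the shape of the curve.

```python
import math


def two_pl_probability(theta: float, a: float, b: float) -> float:
    """Probability of endorsing an item under a 2PL model (assumed form).

    theta: latent depressive symptom level of the respondent
    a:     item discrimination (steepness of the curve)
    b:     item difficulty (theta at which endorsement probability is 0.5)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # Hypothetical item parameters, for illustration only (not study estimates).
    items = {
        "easier item": (1.5, 0.4),
        "harder item": (1.5, 1.6),
    }
    for name, (a, b) in items.items():
        for theta in (-1.0, 0.0, 1.0, 2.0):
            p = two_pl_probability(theta, a, b)
            print(f"{name}: theta={theta:+.1f} -> P(endorse)={p:.3f}")
```

Under this form, an item with a larger b needs a higher symptom level before endorsement becomes likely, and a larger a makes the item separate respondents near b more sharply.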
Results: The CES-D demonstrated adequate internal consistency (α = 0.80; ω = 0.85), but lower marginal reliability (0.65). Around 60% of participants endorsed at least one depressive symptom. All items showed moderate to high discrimination (a > 0.66); "slept restlessly" was the most frequently endorsed item (b = 0.43) and "felt lonely" the hardest to endorse (b = 1.59). Four items ("slept restlessly", "felt lonely", "felt sad", and "could not get going") exhibited significant DIF, with women more likely than men to endorse these items at equivalent symptom levels.
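For readers unfamiliar with the internal consistency coefficient α reported above, the following is a minimal sketch of the standard Cronbach's alpha formula, α = k/(k−1) · (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the toy response matrix are hypothetical; the data are not from ELSA and the code is not the authors' analysis.

```python
from statistics import variance


def cronbach_alpha(responses: list[list[int]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of scores."""
    k = len(responses[0])  # number of items
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Toy binary (0 = no, 1 = yes) responses, invented for illustration only.
    toy = [
        [1, 1, 0, 1],
        [0, 0, 0, 0],
        [1, 1, 1, 1],
        [0, 1, 0, 0],
        [1, 0, 1, 1],
    ]
    print(f"alpha = {cronbach_alpha(toy):.2f}")
```

Alpha rises as items covary more strongly relative to their individual variances, which is why it is read as an index of internal consistency.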
Conclusions: CES-D items showed acceptable reliability and effectively captured varying depression severity. Despite some DIF, no substantial gender-related measurement bias was found, supporting the scale's use for screening in older adult populations.