David Byrne, Fiona Boland, Susan Brannick, Robert M. Carney, Pim Cuijpers, Alexandra L. Dima, Kenneth E. Freedland, Suzanne Guerin, David Hevey, Bishember Kathuria, Emma Wallace, Frank Doyle

Journal of Clinical Epidemiology, Volume 183, Article 111762. DOI: 10.1016/j.jclinepi.2025.111762. Published 2025-03-24.
Applying advanced psychometric approaches yields differential randomized trial effect sizes: secondary analysis of individual participant data from antidepressant studies using the Hamilton rating scale for depression
Objectives
Multiple sophisticated techniques are used to evaluate psychometric scales, in theory reducing error and enhancing the measurement of patient-reported outcomes. We aimed to determine whether applying different psychometric analyses would demonstrate important differences in treatment effects.
Study Design and Setting
We conducted a secondary analysis of individual participant data (IPD) from 20 antidepressant treatment trials obtained from Vivli.org (n = 6843). Pooled item-level data from the Hamilton Rating Scale for Depression (HRSD-17) were analyzed using confirmatory factor analysis (CFA), item response theory (IRT), and network analysis (NA). Multilevel models were used to analyze differences in trial effects at approximately 8 weeks (range 4–12 weeks) after treatment commencement, with standardized mean differences calculated as Cohen's d. The effect size outcomes for the original total depression scores were compared with psychometrically informed outcomes based on abbreviated and weighted depression scores.
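The core comparison described above — Cohen's d computed from the full scale total versus an abbreviated total that drops poorly performing items — can be sketched as follows. This is a minimal illustration with simulated data: the item scores, the number of retained items, and the effect magnitude are all assumptions, not the study's IPD.

```python
import numpy as np

# Illustrative sketch only (simulated data, not the study's IPD): how an
# abbreviated scale total can shift the standardized mean difference
# (Cohen's d) relative to the full 17-item HRSD total.

rng = np.random.default_rng(0)
n = 300
treat = rng.integers(0, 5, size=(n, 17)).astype(float)    # item scores, active arm
placebo = rng.integers(0, 5, size=(n, 17)).astype(float)  # item scores, comparator arm
treat[:, :10] -= 0.8  # simulate improvement concentrated in 10 well-performing items

def cohens_d(a, b):
    """Standardized mean difference using a pooled standard deviation."""
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled_sd

keep = list(range(10))  # hypothetical items retained after psychometric screening
d_full = cohens_d(placebo.sum(axis=1), treat.sum(axis=1))
d_abbrev = cohens_d(placebo[:, keep].sum(axis=1), treat[:, keep].sum(axis=1))
print(f"d (full scale) = {d_full:.2f}, d (abbreviated) = {d_abbrev:.2f}")
```

In this toy setup, dropping the items that carry no treatment signal removes noise from the total, so the abbreviated d is larger than the full-scale d — the same direction of change the study reports for the factor analytic approach.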
Results
Several items performed poorly during psychometric analyses and were eliminated, resulting in different models being obtained for each approach. Treatment effects were modified as follows per psychometric approach: 10.4%–14.9% increase for CFA, 0%–2.9% increase for IRT, and 14.9%–16.4% reduction for NA.
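The percentage modifications reported here are relative changes in Cohen's d. A hypothetical arithmetic example (both d values below are invented for illustration, not taken from the paper):

```python
# Hypothetical values: an assumed original pooled effect size and an
# assumed CFA-informed effect size; neither is from the study.
d_original = 0.67
d_cfa = 0.74
pct_change = 100 * (d_cfa - d_original) / d_original
print(f"{pct_change:+.1f}% change in effect size")
```

With these assumed values, the relative change falls at the lower end of the CFA range reported above.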
Conclusion
Psychometric analyses differentially moderate effect size outcomes depending on the method used. In a 20-trial sample, factor analytic approaches increased treatment effect sizes relative to the original outcomes, NA decreased them, and IRT results reflected original trial outcomes.
Plain Language Summary
This study aimed to determine whether using advanced psychometric methods would reveal any clinically or statistically important differences in clinical trial outcomes when compared to original findings. We applied factor analysis (FA), item response theory (IRT), and network analysis (NA) to the most commonly used measure of depression in clinical settings – the Hamilton Rating Scale for Depression (HRSD) – to identify and remove nonperforming survey items and calculate weighted item scores. We found that the efficacy reported in trials increased when using FA to remove items, but decreased when using NA. There was almost no change in efficacy when using IRT. Using weighted scores based on the respective models offered no additional utility in terms of increasing or decreasing efficacy outcomes.
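A weighted item score of the kind described can be sketched as a loading-weighted sum of the retained items, in contrast to the conventional unweighted total. The loadings and item responses below are invented for illustration and do not come from the paper.

```python
import numpy as np

# Hypothetical CFA loadings for five retained items, and one respondent's
# item responses; both are assumptions for illustration only.
loadings = np.array([0.72, 0.65, 0.58, 0.81, 0.69])
items = np.array([2, 3, 1, 4, 2], dtype=float)

unweighted_total = float(items.sum())     # the conventional scale total
weighted_total = float(items @ loadings)  # loading-weighted alternative
print(unweighted_total, weighted_total)
```

The weighted total rescales each item's contribution by how strongly it loads on the underlying factor; as the summary notes, this reweighting did not materially change efficacy outcomes in the study.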
Journal overview:
The Journal of Clinical Epidemiology strives to enhance the quality of clinical and patient-oriented healthcare research by advancing and applying innovative methods in conducting, presenting, synthesizing, disseminating, and translating research results into optimal clinical practice. Special emphasis is placed on training new generations of scientists and clinical practice leaders.