Quality Assurance of Depression Ratings in Psychiatric Clinical Trials
Michael T Sapko, Cortney Kolesar, Ian R Sharp, Jonathan C Javitt
Journal of Clinical Psychopharmacology, 2025;45(1):28-31 (Epub 2024-11-21)
DOI: 10.1097/JCP.0000000000001936
Citations: 0
Abstract
Background: Extensive experience with antidepressant clinical trials indicates that interrater reliability (IRR) must be maintained to achieve reliable trial results. Contract research organizations have generally accepted a 6-point rating disparity between study site raters and central "master raters" as concordant, in part because of personnel turnover and variability within those organizations. We developed and tested an "insourced" model using a small, dedicated team of rater program managers (RPMs) to determine whether a 3-point disparity threshold is a feasible standard for rating concordance.
Methods: Site raters recorded and scored all Montgomery-Åsberg Depression Rating Scale (MADRS) interviews. Audio files were independently reviewed and scored by RPMs within 24 to 48 hours. Concordance was defined as an absolute difference of 3 points or less between the site rater's and the RPM's MADRS total scores. A difference of 4 or more points triggered a discussion with the site rater and additional training as needed.
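To make the concordance rule concrete, here is a minimal Python sketch of the classification described above; the function name, threshold, and example score pairs are illustrative assumptions, not material from the study.

```python
# Minimal sketch of the concordance rule described in the Methods.
# Names and example data are hypothetical, not from the study.

def classify_pair(site_score: int, rpm_score: int) -> str:
    """Return 'concordant' if the two MADRS totals differ by 3 points or less;
    a difference of 4 or more points flags the pair for rater follow-up."""
    return "concordant" if abs(site_score - rpm_score) <= 3 else "review"

# Hypothetical rating pairs: (site rater total, RPM total)
pairs = [(28, 30), (22, 27), (15, 14), (31, 31)]

for site, rpm in pairs:
    print(site, rpm, classify_pair(site, rpm))
# -> concordant, review, concordant, concordant
```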
Results: In a sample of 236 ratings (58 patients), IRR between site ratings and blinded, independent RPM ratings was 94.49% (223/236). The lowest concordance, 87.93%, occurred at visit 2, the baseline visit of the clinical trial. Concordance rates at visits 3, 4, 5, and 6 were 93.75%, 96.08%, 97.30%, and 100.00%, respectively. The absolute mean difference in MADRS rating pairs was 1.77 points (95% confidence interval: 1.58-1.95). The intraclass correlation coefficient was 0.984, with η² = 0.992 (F = 124.35, P < 0.0001).
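For readers who want to compute this style of summary from paired ratings, the sketch below derives a concordance rate, a mean absolute difference, and a 95% confidence interval from simulated score pairs; the data and variable names are ours, so the numbers will not match the study's. An intraclass correlation could be obtained from the same paired data, for example with the pingouin package's intraclass_corr function.

```python
# Sketch of how summary statistics like those above could be computed from
# paired ratings. The scores are simulated for illustration only; they are
# not the study data and will not reproduce the reported values.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
site = rng.integers(10, 40, size=236)          # hypothetical site-rater MADRS totals
rpm = site + rng.integers(-5, 6, size=236)     # hypothetical RPM re-scores

abs_diff = np.abs(site - rpm)
concordance_rate = np.mean(abs_diff <= 3)      # proportion within 3 points
mean_abs_diff = abs_diff.mean()
ci_low, ci_high = stats.t.interval(
    0.95, df=len(abs_diff) - 1, loc=mean_abs_diff, scale=stats.sem(abs_diff)
)

print(f"Concordance: {concordance_rate:.2%}")
print(f"Mean absolute difference: {mean_abs_diff:.2f} "
      f"(95% CI {ci_low:.2f}-{ci_high:.2f})")
```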
Conclusions: Rigorous rater training together with real-time monitoring of site raters by RPMs can achieve a high degree of IRR on the MADRS.
About the journal:
Journal of Clinical Psychopharmacology, a leading publication in psychopharmacology, offers a wide range of articles reporting on clinical trials and studies, side effects, drug interactions, overdose management, pharmacogenetics, pharmacokinetics, and psychiatric effects of non-psychiatric drugs. The journal keeps clinician-scientists and trainees up-to-date on the latest clinical developments in psychopharmacologic agents, presenting the extensive coverage needed to keep up with every development in this fast-growing field.