Capitalizing on natural language processing (NLP) to automate the evaluation of coach implementation fidelity in guided digital cognitive-behavioral therapy (GdCBT).

IF 5.9 · Zone 2 (Medicine) · JCR Q1 (Psychiatry)

Psychological Medicine. Pub Date: 2025-04-02. DOI: 10.1017/S0033291725000340

Nur Hani Zainal, Regina Eckhardt, Gavin N Rackoff, Ellen E Fitzsimmons-Craft, Elsa Rojas-Ashe, Craig Barr Taylor, Burkhardt Funk, Daniel Eisenberg, Denise E Wilfley, Michelle G Newman

Psychological Medicine, volume 55, e106. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12094662/pdf/

Citations: 0

Abstract

Background: As the use of guided digitally-delivered cognitive-behavioral therapy (GdCBT) grows, pragmatic analytic tools are needed to evaluate coaches' implementation fidelity.

Aims: We evaluated how natural language processing (NLP) and machine learning (ML) methods might automate the monitoring of coaches' implementation fidelity to GdCBT delivered as part of a randomized controlled trial.

Method: Coaches served as guides to 6-month GdCBT with 3,381 assigned users with or at risk for anxiety, depression, or eating disorders. CBT-trained and supervised human coders used a rubric to rate the implementation fidelity of 13,529 coach-to-user messages. NLP methods abstracted data from text-based coach-to-user messages, and 11 ML models predicting coach implementation fidelity were evaluated.

Results: Inter-rater agreement by human coders was excellent (intra-class correlation coefficient = .980-.992). Coaches achieved behavioral targets at the start of the GdCBT and maintained strong fidelity throughout most subsequent messages. Coaches also avoided prohibited actions (e.g. reinforcing users' avoidance). Sentiment analyses generally indicated a higher frequency of coach-delivered positive than negative sentiment words and predicted coach implementation fidelity with acceptable performance metrics (e.g. area under the receiver operating characteristic curve [AUC] = 74.48%). The final best-performing ML algorithms that included a more comprehensive set of NLP features performed well (e.g. AUC = 76.06%).

Conclusions: NLP and ML tools could help clinical supervisors automate monitoring of coaches' implementation fidelity to GdCBT. These tools could maximize allocation of scarce resources by reducing the personnel time needed to measure fidelity, potentially freeing up more time for high-quality clinical care.
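As a rough illustration of the kind of pipeline the abstract describes (sentiment-word counts used to predict a binary fidelity rating, scored by AUC), here is a minimal sketch. It is not the authors' method: the word lists, example messages, labels, and the function names `sentiment_counts` and `auc` are all invented for this example, and the study's actual lexicons, features, and 11 ML models are not specified in the abstract.

```python
import re

# Hypothetical toy lexicons (assumption); the study's sentiment resources are not given here.
POSITIVE = {"great", "proud", "progress", "well", "glad"}
NEGATIVE = {"bad", "fail", "worse", "wrong", "avoid"}


def sentiment_counts(message):
    """Return (positive, negative) lexicon word counts for one message."""
    tokens = re.findall(r"[a-z']+", message.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return pos, neg


def auc(scores, labels):
    """AUC as the probability that a random label-1 item outscores
    a random label-0 item, with ties counted as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one item from each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented coach messages with invented human fidelity ratings (1 = adherent).
messages = [
    "Great progress this week, I am proud of you!",
    "Glad to hear the exercise went well.",
    "It is fine to avoid the exposure task if it feels bad.",
    "That sounds wrong, things may get worse.",
]
labels = [1, 1, 0, 0]

# Net sentiment (positive minus negative count) serves as the predictor score.
scores = [p - n for p, n in map(sentiment_counts, messages)]
print(auc(scores, labels))
```

The pairwise definition of AUC is used here because it needs no external library and makes the metric's meaning explicit. On real data, a fitted classifier's predicted probabilities would replace the raw net-sentiment score.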


Source journal

Psychological Medicine (Medicine: Psychiatry)

CiteScore

11.30

Self-citation rate

4.30%

Articles published

711

Review time

3-6 weeks

About the journal: Now in its fifth decade of publication, Psychological Medicine is a leading international journal in the fields of psychiatry, related aspects of psychology and basic sciences. From 2014, there are 16 issues a year, each featuring original articles reporting key research being undertaken worldwide, together with shorter editorials by distinguished scholars and an important book review section. The journal's success is clearly demonstrated by a consistently high impact factor.