Development of an artificial intelligence-based measure of therapists' skills: A multimodal proof of concept.

IF 3.9 · Zone 2 (Psychology) · JCR Q2 (PSYCHOLOGY, CLINICAL)

Psychotherapy · Pub Date: 2025-09-01 · Epub Date: 2025-02-03 · DOI: 10.1037/pst0000561

Katie Aafjes-van Doorn, Marcelo Cicconet, Jordan Bate, Jeffrey F Cohn, Marc Aafjes


Citations: 0

Abstract




The facilitative interpersonal skills (FIS) task is a performance-based task designed to assess clinicians' capacity for facilitating a collaborative relationship. Performance on FIS is a robust clinician-level predictor of treatment outcomes. However, the FIS task has limited scalability because human rating of FIS requires specialized training and is time-intensive. We aimed to catalyze a "big needle jump" by developing an artificial intelligence- (AI-) based automated FIS measurement that captures all behavioral audiovisual markers available to human FIS raters. A total of 956 response clips were collected from 78 mental health clinicians. Three human raters rated the eight FIS subscales and reached sufficient interrater reliability (intraclass correlation based on three raters [ICC3k] for overall FIS = 0.85). We extracted text-, audio-, and video-based features and applied multimodal modeling (multilayer perceptron with a single hidden layer) to predict overall FIS and eight FIS subscales rated along a 1-5 scale continuum. We conducted 10-fold cross-validation analyses. For overall FIS, we reached moderate size relationships with the human-based ratings (Spearman's ρ = .50). Performance for subscales was variable (Spearman's ρ from .30 to .61). Inclusion of audio and video modalities improved the accuracy of the model, especially for the Emotional Expression and Verbal Fluency subscales. All three modalities contributed to the prediction performance, with text-based features contributing relatively most. Our multimodal model performed better than previously published unimodal models on the overall FIS and some FIS subscales. If confirmed in external validation studies, this AI-based FIS measurement may be used for the development of feedback tools for more targeted training, supervision, and deliberate practice. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
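The abstract reports two agreement statistics: ICC3k for agreement among the three human raters, and Spearman's ρ between model predictions and human ratings. The following is a minimal sketch of how these two quantities are conventionally computed, not the authors' code. The function names and the toy ratings are invented for illustration, and ICC3k is assumed to follow the standard Shrout and Fleiss ICC(3,k) definition (two-way mixed effects, consistency, average of k raters).

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions start..end
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def icc3k(ratings):
    """ICC(3,k) from a list of rows, one row per clip, one column per rater."""
    n, k = len(ratings), len(ratings[0])
    grand = mean([v for row in ratings for v in row])
    row_means = [mean(row) for row in ratings]
    col_means = [mean([row[j] for row in ratings]) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between clips
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ms_rows = ss_rows / (n - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / ms_rows


if __name__ == "__main__":
    # Hypothetical 1-5 ratings: five clips, three raters each (invented numbers).
    toy_ratings = [[1, 2, 1], [2, 2, 3], [3, 3, 3], [4, 5, 4], [5, 4, 5]]
    human_mean = [mean(row) for row in toy_ratings]
    # Hypothetical model predictions for the same five clips.
    toy_predictions = [1.8, 2.1, 3.4, 3.9, 4.2]
    print("ICC3k:", round(icc3k(toy_ratings), 3))
    print("Spearman rho:", round(spearman_rho(toy_predictions, human_mean), 3))
```

Because ICC(3,k) is a consistency measure, a rater who is systematically one point stricter than the others does not lower it. Spearman's ρ likewise rewards only correct rank ordering, so a model can reach a high ρ even when its predictions are offset from the 1-5 scale.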


Source journal

Psychotherapy (PSYCHOLOGY, CLINICAL)

CiteScore

4.60

Self-citation rate

12.00%

Articles published

Journal description: Psychotherapy: Theory, Research, Practice, Training publishes a wide variety of articles relevant to the field of psychotherapy. The journal strives to foster interactions among individuals involved with training, practice, theory, and research, since all areas are essential to psychotherapy. This journal is an invaluable resource for practicing clinical and counseling psychologists, social workers, and mental health professionals.