Evaluating motivational interview quality using large language models and hidden Markov models.

Impact factor 3.4; Zone 2 (Medicine); JCR Q2 (Psychiatry)
Kyungho Lim, Young-Chul Jung, Byung-Hoon Kim
BMC Psychiatry 25(1), 908. Published 2025-10-01. Journal Article. DOI: 10.1186/s12888-025-07391-1 (https://doi.org/10.1186/s12888-025-07391-1)
Citations: 0

Abstract


Background: Motivational Interviewing (MI) is a counseling approach that promotes behavior change by eliciting "change talk" and minimizing "sustain talk." Traditional methods for assessing MI quality, such as manual coding, are labor-intensive, subjective, and difficult to scale. This study introduces an automated framework integrating large language models (LLMs) and Hidden Markov Models (HMMs) for evaluation of MI session quality.

Aims: This study evaluates the effectiveness of an LLM-HMM framework in predicting MI session quality and examines motivational state transitions in high- and low-quality sessions.

Method: A dataset of 40 MI sessions was analyzed. Client utterances were classified and numerically scored by an LLM based on their intention toward or away from change. With HMMs, we used these scores to examine the motivational state transitions across each session. Differences between high- and low-quality sessions were quantified by comparing transition matrices using Frobenius norms. Statistical significance was assessed via a permutation test. Predictive performance was evaluated using logistic regression with leave-one-out cross-validation (LOOCV), where transition matrix elements served as independent variables and interview quality as the dependent variable.
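The comparison step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes each session has already been reduced to a sequence of integer state labels (the HMM decoding step is omitted), that each group's transition matrix is estimated from pooled transition counts, and that the permutation test shuffles whole sessions between groups. The function names, the three-state setup, and the toy sequences are all hypothetical.

```python
import math
import random


def transition_matrix(sequences, n_states):
    """Row-normalised transition counts pooled over a group of state sequences."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a state with no outgoing transitions gets a uniform row.
        matrix.append([c / total if total else 1.0 / n_states for c in row])
    return matrix


def frobenius_distance(p, q):
    """Frobenius norm of the element-wise difference between two matrices."""
    return math.sqrt(sum((a - b) ** 2
                         for row_p, row_q in zip(p, q)
                         for a, b in zip(row_p, row_q)))


def permutation_test(group_a, group_b, n_states, n_perm=2000, seed=0):
    """Return (observed distance, p-value) by shuffling session group labels."""
    rng = random.Random(seed)
    observed = frobenius_distance(transition_matrix(group_a, n_states),
                                  transition_matrix(group_b, n_states))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        a, b = pooled[:len(group_a)], pooled[len(group_a):]
        d = frobenius_distance(transition_matrix(a, n_states),
                               transition_matrix(b, n_states))
        if d >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic toy sessions, invented for illustration only.
    high = [[0, 1, 2, 1, 0, 1, 2, 2, 1, 0] for _ in range(5)]
    low = [[0, 0, 0, 0, 1, 0, 0, 0, 0, 0] for _ in range(5)]
    distance, p_value = permutation_test(high, low, n_states=3)
    print(f"Frobenius distance: {distance:.3f}, permutation p-value: {p_value:.4f}")
```

Shuffling whole sessions rather than individual utterances keeps each session's internal ordering intact, so the null distribution reflects only the group assignment.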

Results: High-quality MI sessions exhibited fluid transitions between motivational states, whereas low-quality sessions showed persistence in resistance-oriented states. A statistically significant difference in transition matrices was observed between session groups (p < 0.001). The framework achieved a mean LOOCV accuracy of 0.80, demonstrating strong predictive performance in identifying MI session quality.
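To make the reported accuracy concrete: if all 40 sessions took part in the leave-one-out procedure, a mean accuracy of 0.80 would correspond to 32 held-out sessions classified correctly. The sketch below shows how such a figure can be computed. It is a pure-Python stand-in written for illustration; the abstract does not say which software, regularisation, or optimiser the authors used, so the gradient-descent fit, the hyperparameters, the function names, and the toy feature vectors (flattened 2x2 transition matrices) are all assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500, l2=0.01):
    """Batch gradient descent for L2-regularised logistic regression."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Class 1 if the predicted probability is at least 0.5, else class 0."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def loocv_accuracy(X, y):
    """Hold out each session once, train on the rest, and average the hits."""
    correct = 0
    for i in range(len(X)):
        w, b = fit_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += int(predict(w, b, X[i]) == y[i])
    return correct / len(X)


if __name__ == "__main__":
    # Synthetic features, invented for illustration: [p00, p01, p10, p11].
    X = [[0.40, 0.60, 0.50, 0.50], [0.45, 0.55, 0.50, 0.50],
         [0.35, 0.65, 0.45, 0.55], [0.40, 0.60, 0.55, 0.45],
         [0.90, 0.10, 0.80, 0.20], [0.85, 0.15, 0.80, 0.20],
         [0.90, 0.10, 0.75, 0.25], [0.95, 0.05, 0.85, 0.15]]
    y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = high quality, 0 = low quality
    print(f"LOOCV accuracy: {loocv_accuracy(X, y):.2f}")
```

With only 40 sessions, each held-out prediction moves the accuracy by 0.025, so an estimate of this kind carries wide uncertainty.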

Conclusions: This study presents a scalable, objective alternative to manual MI evaluation. Future applications may include real-time therapist support, training, and prognosis prediction, pending further validation on field-collected data.

Source journal
BMC Psychiatry (Medicine: Psychiatry)
CiteScore: 5.90
Self-citation rate: 4.50%
Articles published: 716
Review time: 3-6 weeks
About the journal: BMC Psychiatry is an open access, peer-reviewed journal that considers articles on all aspects of the prevention, diagnosis and management of psychiatric disorders, as well as related molecular genetics, pathophysiology, and epidemiology.