利用大型语言模型模拟心理治疗客户交互：Client101的开发和可用性研究。

IF 3.2 Q1 EDUCATION, SCIENTIFIC DISCIPLINES

JMIR Medical Education Pub Date : 2025-07-31 DOI:10.2196/68056

Daniel Cabrera Lozoya, Mike Conway, Edoardo Sebastiano De Duro, Simon D'Alfonso

{"title":"利用大型语言模型模拟心理治疗客户交互：Client101的开发和可用性研究。","authors":"Daniel Cabrera Lozoya, Mike Conway, Edoardo Sebastiano De Duro, Simon D'Alfonso","doi":"10.2196/68056","DOIUrl":null,"url":null,"abstract":"Background: In recent years, large language models (LLMs) have shown a remarkable ability to generate human-like text. One potential application of this capability is using LLMs to simulate clients in a mental health context. This research presents the development and evaluation of Client101, a web conversational platform featuring LLM-driven chatbots designed to simulate mental health clients.Objective: We aim to develop and test a web-based conversational psychotherapy training tool designed to closely resemble clients with mental health issues.Methods: We used GPT-4 and prompt engineering techniques to develop chatbots that simulate realistic client conversations. Two chatbots were created based on clinical vignette cases: one representing a person with depression and the other, a person with generalized anxiety disorder. A total of 16 mental health professionals were instructed to conduct single sessions with the chatbots using a cognitive behavioral therapy framework; a total of 15 sessions with the anxiety chatbot and 14 with the depression chatbot were completed. After each session, participants completed a 19-question survey assessing the chatbot's ability to simulate the mental health condition and its potential as a training tool. Additionally, we used the LIWC (Linguistic Inquiry and Word Count) tool to analyze the psycholinguistic features of the chatbot conversations related to anxiety and depression. These features were compared to those in a set of webchat psychotherapy sessions with human clients-42 sessions related to anxiety and 47 related to depression-using an independent samples t test.Results: Participants' survey responses were predominantly positive regarding the chatbots' realism and portrayal of mental health conditions. For instance, 93% (14/15) considered that the chatbot provided a coherent and convincing narrative typical of someone with an anxiety condition. The statistical analysis of LIWC psycholinguistic features revealed significant differences between chatbot and human therapy transcripts for 3 of 8 anxiety-related features: negations (t56=4.03, P=.001), family (t56=-8.62, P=.001), and negative emotions (t56=-3.91, P=.002). The remaining 5 features-sadness, personal pronouns, present focus, social, and anger-did not show significant differences. For depression-related features, 4 of 9 showed significant differences: negative emotions (t60=-3.84, P=.003), feeling (t60=-6.40, P<.001), health (t60=-4.13, P=.001), and illness (t60=-5.52, P<.001). The other 5 features-sadness, anxiety, mental, first-person pronouns, and discrepancy-did not show statistically significant differences.Conclusions: This research underscores both the strengths and limitations of using GPT-4-powered chatbots as tools for psychotherapy training. Participant feedback suggests that the chatbots effectively portray mental health conditions and are generally perceived as valuable training aids. However, differences in specific psycholinguistic features suggest targeted areas for enhancement, helping refine Client101's effectiveness as a tool for training mental health professionals.","PeriodicalId":36236,"journal":{"name":"JMIR Medical Education","volume":"11 ","pages":"e68056"},"PeriodicalIF":3.2000,"publicationDate":"2025-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12312989/pdf/","citationCount":"0","resultStr":"{\"title\":\"Leveraging Large Language Models for Simulated Psychotherapy Client Interactions: Development and Usability Study of Client101.\",\"authors\":\"Daniel Cabrera Lozoya, Mike Conway, Edoardo Sebastiano De Duro, Simon D'Alfonso\",\"doi\":\"10.2196/68056\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Background: In recent years, large language models (LLMs) have shown a remarkable ability to generate human-like text. One potential application of this capability is using LLMs to simulate clients in a mental health context. This research presents the development and evaluation of Client101, a web conversational platform featuring LLM-driven chatbots designed to simulate mental health clients.Objective: We aim to develop and test a web-based conversational psychotherapy training tool designed to closely resemble clients with mental health issues.Methods: We used GPT-4 and prompt engineering techniques to develop chatbots that simulate realistic client conversations. Two chatbots were created based on clinical vignette cases: one representing a person with depression and the other, a person with generalized anxiety disorder. A total of 16 mental health professionals were instructed to conduct single sessions with the chatbots using a cognitive behavioral therapy framework; a total of 15 sessions with the anxiety chatbot and 14 with the depression chatbot were completed. After each session, participants completed a 19-question survey assessing the chatbot's ability to simulate the mental health condition and its potential as a training tool. Additionally, we used the LIWC (Linguistic Inquiry and Word Count) tool to analyze the psycholinguistic features of the chatbot conversations related to anxiety and depression. These features were compared to those in a set of webchat psychotherapy sessions with human clients-42 sessions related to anxiety and 47 related to depression-using an independent samples t test.Results: Participants' survey responses were predominantly positive regarding the chatbots' realism and portrayal of mental health conditions. For instance, 93% (14/15) considered that the chatbot provided a coherent and convincing narrative typical of someone with an anxiety condition. The statistical analysis of LIWC psycholinguistic features revealed significant differences between chatbot and human therapy transcripts for 3 of 8 anxiety-related features: negations (t56=4.03, P=.001), family (t56=-8.62, P=.001), and negative emotions (t56=-3.91, P=.002). The remaining 5 features-sadness, personal pronouns, present focus, social, and anger-did not show significant differences. For depression-related features, 4 of 9 showed significant differences: negative emotions (t60=-3.84, P=.003), feeling (t60=-6.40, P<.001), health (t60=-4.13, P=.001), and illness (t60=-5.52, P<.001). The other 5 features-sadness, anxiety, mental, first-person pronouns, and discrepancy-did not show statistically significant differences.Conclusions: This research underscores both the strengths and limitations of using GPT-4-powered chatbots as tools for psychotherapy training. Participant feedback suggests that the chatbots effectively portray mental health conditions and are generally perceived as valuable training aids. However, differences in specific psycholinguistic features suggest targeted areas for enhancement, helping refine Client101's effectiveness as a tool for training mental health professionals.\",\"PeriodicalId\":36236,\"journal\":{\"name\":\"JMIR Medical Education\",\"volume\":\"11 \",\"pages\":\"e68056\"},\"PeriodicalIF\":3.2000,\"publicationDate\":\"2025-07-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12312989/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"JMIR Medical Education\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2196/68056\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"EDUCATION, SCIENTIFIC DISCIPLINES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"JMIR Medical Education","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2196/68056","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EDUCATION, SCIENTIFIC DISCIPLINES","Score":null,"Total":0}

引用次数: 0

摘要

背景：近年来，大型语言模型（llm）在生成类人文本方面表现出了非凡的能力。这种能力的一个潜在应用是使用法学硕士来模拟心理健康背景下的客户。本研究介绍了Client101的开发和评估，这是一个网络会话平台，具有法学硕士驱动的聊天机器人，旨在模拟心理健康客户。目的：我们的目标是开发和测试一个基于网络的会话心理治疗培训工具，旨在与有心理健康问题的客户密切相似。方法：我们使用GPT-4和提示工程技术来开发模拟真实客户对话的聊天机器人。两个聊天机器人是根据临床案例创建的：一个代表抑郁症患者，另一个代表广泛性焦虑症患者。总共有16名心理健康专业人员被指示使用认知行为治疗框架与聊天机器人进行单次会话；与焦虑聊天机器人共完成了15次会话，与抑郁聊天机器人共完成了14次会话。每次训练结束后，参与者完成一份19个问题的调查，评估聊天机器人模拟心理健康状况的能力，以及作为训练工具的潜力。此外，我们使用LIWC（语言查询和字数统计）工具来分析与焦虑和抑郁相关的聊天机器人对话的心理语言特征。使用独立样本t检验，将这些特征与一组与人类客户进行的网络聊天心理治疗会话（42次与焦虑有关，47次与抑郁有关）中的特征进行比较。结果：参与者对聊天机器人的现实主义和对心理健康状况的描述的调查反应主要是积极的。例如，93%（14/15）的人认为聊天机器人提供了一个连贯而令人信服的故事，这是焦虑患者的典型特征。LIWC心理语言特征的统计分析显示，聊天机器人和人类治疗记录在8个焦虑相关特征中的3个方面存在显著差异：消极（t56=4.03， P=.001）、家庭（t56=-8.62， P=.001）和消极情绪（t56=-3.91， P=.002）。剩下的5个特征——悲伤、人称代词、当下焦点、社交和愤怒——并没有显示出显著的差异。对于抑郁相关特征，9个特征中有4个表现出显著差异：负面情绪（t60=-3.84， P= 0.003），感觉（t60=-6.40， P）。结论：本研究强调了使用gpt -4驱动的聊天机器人作为心理治疗培训工具的优势和局限性。参与者的反馈表明，聊天机器人有效地描述了心理健康状况，通常被认为是有价值的培训辅助工具。然而，特定心理语言特征的差异提示了目标区域的增强，有助于完善Client101作为培训心理健康专业人员的工具的有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

Leveraging Large Language Models for Simulated Psychotherapy Client Interactions: Development and Usability Study of Client101.

查看原文本刊更多论文

Leveraging Large Language Models for Simulated Psychotherapy Client Interactions: Development and Usability Study of Client101.

Background: In recent years, large language models (LLMs) have shown a remarkable ability to generate human-like text. One potential application of this capability is using LLMs to simulate clients in a mental health context. This research presents the development and evaluation of Client101, a web conversational platform featuring LLM-driven chatbots designed to simulate mental health clients.

Objective: We aim to develop and test a web-based conversational psychotherapy training tool designed to closely resemble clients with mental health issues.

Methods: We used GPT-4 and prompt engineering techniques to develop chatbots that simulate realistic client conversations. Two chatbots were created based on clinical vignette cases: one representing a person with depression and the other, a person with generalized anxiety disorder. A total of 16 mental health professionals were instructed to conduct single sessions with the chatbots using a cognitive behavioral therapy framework; a total of 15 sessions with the anxiety chatbot and 14 with the depression chatbot were completed. After each session, participants completed a 19-question survey assessing the chatbot's ability to simulate the mental health condition and its potential as a training tool. Additionally, we used the LIWC (Linguistic Inquiry and Word Count) tool to analyze the psycholinguistic features of the chatbot conversations related to anxiety and depression. These features were compared to those in a set of webchat psychotherapy sessions with human clients-42 sessions related to anxiety and 47 related to depression-using an independent samples t test.

Results: Participants' survey responses were predominantly positive regarding the chatbots' realism and portrayal of mental health conditions. For instance, 93% (14/15) considered that the chatbot provided a coherent and convincing narrative typical of someone with an anxiety condition. The statistical analysis of LIWC psycholinguistic features revealed significant differences between chatbot and human therapy transcripts for 3 of 8 anxiety-related features: negations (t56=4.03, P=.001), family (t56=-8.62, P=.001), and negative emotions (t56=-3.91, P=.002). The remaining 5 features-sadness, personal pronouns, present focus, social, and anger-did not show significant differences. For depression-related features, 4 of 9 showed significant differences: negative emotions (t60=-3.84, P=.003), feeling (t60=-6.40, P<.001), health (t60=-4.13, P=.001), and illness (t60=-5.52, P<.001). The other 5 features-sadness, anxiety, mental, first-person pronouns, and discrepancy-did not show statistically significant differences.

Conclusions: This research underscores both the strengths and limitations of using GPT-4-powered chatbots as tools for psychotherapy training. Participant feedback suggests that the chatbots effectively portray mental health conditions and are generally perceived as valuable training aids. However, differences in specific psycholinguistic features suggest targeted areas for enhancement, helping refine Client101's effectiveness as a tool for training mental health professionals.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

JMIR Medical Education Social Sciences-Education

CiteScore

6.90

自引率

5.60%

发文量

审稿时长

8 weeks