Exploring the application of large language models in coding the experiencing scale (EXP).

Cogent Mental Health · Pub Date: 2026-04-23 · eCollection Date: 2026-01-01 · DOI: 10.1080/28324765.2026.2664163
Brian Yim, J Christopher Muran, Qianying Ren, Bernard Gorman
Journal: Cogent Mental Health, Volume 5, Issue 1, Article 2664163
Publication type: Journal Article
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13107987/pdf/
Citations: 0

Abstract

Psychotherapy process measures like the Experiencing Scale (EXP) offer valuable insight into clinical interactions but are time-intensive to code. Large language models (LLMs) like ChatGPT have the potential to streamline this process, but empirical validation is nascent. This exploratory study aimed to provide a proof of concept for coding the EXP using ChatGPT, with special attention to ethical considerations, limitations, and future directions. ChatGPT was used to code 79 psychotherapy transcripts drawn from the EXP manual. Multiple ChatGPT models were tested using varied few-shot learning prompt engineering protocols. Data collection occurred in three phases, during which models rated both modal and peak EXP scores for all transcripts. ChatGPT demonstrated moderate agreement with manual reference ratings. An efficient configuration (o3-mini, 5-shot prompting) yielded moderate reliability for both modal EXP scores (ICC[3,1] = .67, 95% CI [.53, .79]) and peak EXP scores (ICC[3,1] = .71, 95% CI [.58, .81]). LLMs may feasibly augment or replace human EXP coders under certain conditions. However, the evidence is preliminary, and ethical and technical limitations remain. Future research should validate the present methodology using out-of-manual data, assess potential pretraining exposure, and explore locally hosted LLM applications to mitigate privacy concerns.
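The abstract reports that a 5-shot prompting protocol performed best, but the actual prompts are not reproduced here. As a rough illustration of what "few-shot prompting" for EXP coding involves, the sketch below assembles a rubric, five worked examples, and a target segment into a single prompt string. The rubric summary and all example segments are hypothetical placeholders, not the authors' materials.

```python
# Sketch of assembling a 5-shot EXP coding prompt. The rubric text and
# (segment, modal, peak) examples are invented placeholders standing in
# for the study's manual-derived materials.

EXP_RUBRIC = (
    "Rate the client's level of experiencing on the 7-point Experiencing "
    "Scale (EXP), from 1 (impersonal, detached narration) to 7 (felt "
    "experiencing is the basis for problem resolution). Report a modal "
    "(most frequent) and a peak (highest) rating for the segment."
)

FEW_SHOT_EXAMPLES = [
    ("Client describes the weather on the drive over.", 1, 1),
    ("Client recounts an argument with a coworker, sticking to events.", 2, 2),
    ("Client describes feeling anxious before meetings.", 3, 4),
    ("Client explores what the anxiety says about their self-image.", 4, 5),
    ("Client arrives at a new understanding of a recurring feeling.", 5, 6),
]

def build_prompt(segment: str, shots=FEW_SHOT_EXAMPLES) -> str:
    """Assemble a few-shot prompt: rubric, worked examples, then the target."""
    parts = [EXP_RUBRIC, ""]
    for text, modal, peak in shots:
        parts.append(f"Segment: {text}\nModal EXP: {modal}\nPeak EXP: {peak}\n")
    parts.append(f"Segment: {segment}\nModal EXP:")
    return "\n".join(parts)
```

The resulting string would be sent to the model as a single message; parsing the modal and peak replies back into integers is omitted here.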
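The reliability figures above are ICC(3,1): a two-way mixed-effects, single-rater, consistency-type intraclass correlation. A minimal sketch of that computation from a subjects-by-raters matrix follows; the data in the test are made up, and the study's actual ratings are not reproduced.

```python
import numpy as np

def icc_3_1(ratings: np.ndarray) -> float:
    """ICC(3,1): two-way mixed effects, consistency, single rater.

    ratings: (n_subjects, k_raters) matrix of scores.
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)          # per-subject means
    col_means = ratings.mean(axis=0)          # per-rater means
    ss_total = ((ratings - grand) ** 2).sum()
    ss_rows = k * ((row_means - grand) ** 2).sum()   # between-subjects
    ss_cols = n * ((col_means - grand) ** 2).sum()   # between-raters
    ss_err = ss_total - ss_rows - ss_cols            # residual
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
```

Because this is the consistency form, a rater who systematically scores one point higher than another still yields an ICC of 1; only disagreements in relative ordering reduce it.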
