Development and preliminary testing of a secure large language model-based chatbot for brief alcohol counseling in young adults

IF 3.9 | Zone 2 (Medicine) | JCR Q1 (Psychiatry)

Drug and Alcohol Dependence | Pub Date: 2025-04-28 | DOI: 10.1016/j.drugalcdep.2025.112697

Brian Suffoletto, Duncan B. Clark, Christine Lee, Michael Mason, Jordan Schultz, Irvin Szeto, Denise Walker

Drug and Alcohol Dependence, volume 272, Article 112697. Journal Article. https://www.sciencedirect.com/science/article/pii/S0376871625001504

Citations: 0

Abstract

Objective

Young adults face elevated risks from alcohol use yet encounter significant barriers to accessing evidence-based interventions. Large language models (LLMs) represent a promising advancement for delivering personalized behavioral interventions, but their application to alcohol counseling remains unexplored. This study evaluated the development and preliminary outcomes of a Secure GPT-4-powered text-based Motivational Interviewing Conversational Agent (MICA).

Method

Using a prospective single-arm pilot design, we evaluated MICA across two phases (Phase I: n = 8; Phase II: n = 37), editing the LLM prompts between phases. Participants aged 18–25 who reported consuming ≥10 standard alcohol units weekly completed a counseling session with MICA. We evaluated safety and compared MI fidelity (relational and technical sub-scales of the Client Evaluation of MI [CEMI]) and usability (System Usability Scale) between phases. We also explored surrogate measures of effectiveness (i.e., the proportion of change talk relative to sustain talk in session logs) and qualitative feedback themes.

Results

No unsafe responses were observed. MI fidelity improved significantly in the CEMI relational sub-scale from Phase I to Phase II (67.2% to 82.6%, p = 0.03). Usability remained consistently high across phases (Phase I: 85.4; Phase II: 80.9; p = 0.45). The proportion of within-session change talk was also consistently high (Phase I: 65.2%; Phase II: 75.8%; p = 0.10).

Conclusions

This study provides preliminary evidence that LLM-based chatbots can deliver MI-adherent alcohol interventions that are acceptable to young adults and maintain high MI fidelity. Future research should employ randomized controlled designs with longer follow-up periods to evaluate the impact on drinking outcomes.
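
The change-talk proportion reported in the Results can be illustrated with a short calculation. The abstract does not give the exact formula, so this is a minimal sketch that assumes the common definition of change talk divided by the sum of change talk and sustain talk; the function name `percent_change_talk` and the utterance counts are hypothetical and are not data from the study.

```python
def percent_change_talk(change_talk: int, sustain_talk: int) -> float:
    """Return change talk as a percentage of all coded change + sustain talk.

    Assumed definition (not confirmed by the abstract):
        100 * CT / (CT + ST)
    """
    total = change_talk + sustain_talk
    if total == 0:
        # No coded utterances: the proportion is undefined.
        raise ValueError("no change talk or sustain talk utterances coded")
    return 100.0 * change_talk / total


# Hypothetical coded counts for a single session (illustration only).
example = percent_change_talk(change_talk=15, sustain_talk=5)
print(f"{example:.1f}%")  # 75.0%
```

Under this definition, values above 50% mean a participant voiced more statements favoring change than statements favoring the status quo.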


Source journal

Drug and Alcohol Dependence (Medicine: Psychiatry)

CiteScore: 7.40

Self-citation rate: 7.10%

Articles published: 409

Review time: 41 days

Journal description: Drug and Alcohol Dependence is an international journal devoted to publishing original research, scholarly reviews, commentaries, and policy analyses in the area of drug, alcohol and tobacco use and dependence. Articles range from studies of the chemistry of substances of abuse, their actions at molecular and cellular sites, and in vitro and in vivo investigations of their biochemical, pharmacological and behavioural actions, through laboratory-based and clinical research in humans and substance abuse treatment and prevention research, to studies employing methods from epidemiology, sociology, and economics.