Development and preliminary testing of a secure large language model-based chatbot for brief alcohol counseling in young adults
Brian Suffoletto, Duncan B. Clark, Christine Lee, Michael Mason, Jordan Schultz, Irvin Szeto, Denise Walker
Drug and Alcohol Dependence, volume 272, Article 112697. Published 2025-04-28. DOI: 10.1016/j.drugalcdep.2025.112697
https://www.sciencedirect.com/science/article/pii/S0376871625001504
Abstract
Objective
Young adults face elevated risks from alcohol use yet encounter significant barriers to accessing evidence-based interventions. Large language models (LLMs) represent a promising advancement for delivering personalized behavioral interventions, but their application to alcohol counseling remains unexplored. This study evaluated the development and preliminary outcomes of a secure, GPT-4-powered, text-based Motivational Interviewing Conversational Agent (MICA).
Method
Using a prospective single-arm pilot design, we evaluated MICA across two phases (Phase I: n = 8; Phase II: n = 37), editing the LLM prompts between phases. Participants aged 18–25 who reported consuming ≥ 10 standard alcohol units weekly completed a counseling session with MICA. We evaluated safety and compared MI fidelity (relational and technical sub-scales of the Client Evaluation of MI [CEMI]) and usability (System Usability Scale) between phases. We also explored surrogate measures of effectiveness (i.e., the proportion of change talk relative to sustain talk in session logs) and qualitative feedback themes.
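The change-talk measure described above can be illustrated with a minimal sketch. It assumes each client utterance in a session log has already been hand-coded as "change", "sustain", or "neutral"; those labels, the function name `change_talk_proportion`, and the example data are hypothetical and are not taken from the study, and the abstract does not specify the coding scheme the authors used.

```python
from collections import Counter


def change_talk_proportion(codes):
    """Return change talk / (change talk + sustain talk).

    `codes` is a list of per-utterance labels (assumed labels:
    "change", "sustain", "neutral"). Returns None when the session
    contains neither change talk nor sustain talk.
    """
    counts = Counter(codes)
    change = counts["change"]
    sustain = counts["sustain"]
    total = change + sustain
    if total == 0:
        return None
    return change / total


# Hypothetical coded utterances from a single session log.
session_codes = ["change", "sustain", "neutral", "change", "change", "sustain"]
print(f"{change_talk_proportion(session_codes):.1%}")  # prints 60.0%
```

Neutral utterances are excluded from the denominator here, so the value reflects the balance between change talk and sustain talk only.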
Results
No unsafe responses were observed. MI fidelity improved significantly in the CEMI relational sub-scale from Phase I to Phase II (67.2% to 82.6%, p = 0.03). Usability remained consistently high across phases (Phase I: 85.4; Phase II: 80.9; p = 0.45). The proportion of within-session change talk was also consistently high (Phase I: 65.2%; Phase II: 75.8%; p = 0.10).
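For readers unfamiliar with the usability figures, the sketch below shows the standard System Usability Scale scoring rule, which maps ten 1-to-5 item responses onto a 0-to-100 score. The abstract does not describe the authors' scoring procedure, so this is an assumption that they used the standard rule; the function name `sus_score` and the example responses are hypothetical.

```python
def sus_score(responses):
    """Score one System Usability Scale questionnaire (standard rule).

    `responses` holds the ten item answers in order, each 1-5.
    Odd-numbered items contribute (answer - 1), even-numbered items
    contribute (5 - answer); the sum is multiplied by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += answer - 1  # positively worded item
        else:
            total += 5 - answer  # negatively worded item
    return total * 2.5


# Hypothetical responses from one participant.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # prints 75.0
```

A phase-level figure such as those reported above would then be the mean of these per-participant scores.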
Conclusions
This study provides preliminary evidence that LLM-based chatbots can deliver MI-adherent alcohol interventions that are acceptable to young adults and maintain high MI fidelity. Future research should employ randomized controlled designs with longer follow-up periods to evaluate the impact on drinking outcomes.
About the journal:
Drug and Alcohol Dependence is an international journal devoted to publishing original research, scholarly reviews, commentaries, and policy analyses in the area of drug, alcohol and tobacco use and dependence. Articles range from studies of the chemistry of substances of abuse, their actions at molecular and cellular sites, in vitro and in vivo investigations of their biochemical, pharmacological and behavioural actions, laboratory-based and clinical research in humans, substance abuse treatment and prevention research, and studies employing methods from epidemiology, sociology, and economics.