Prompt Engineering an Informational Chatbot for Education on Mental Health Using a Multiagent Approach for Enhanced Compliance With Prompt Instructions: Algorithm Development and Validation
Per Niklas Waaler, Musarrat Hussain, Igor Molchanov, Lars Ailo Bongo, Brita Elvevåg
{"title":"Prompt Engineering an Informational Chatbot for Education on Mental Health Using a Multiagent Approach for Enhanced Compliance With Prompt Instructions: Algorithm Development and Validation.","authors":"Per Niklas Waaler, Musarrat Hussain, Igor Molchanov, Lars Ailo Bongo, Brita Elvevåg","doi":"10.2196/69820","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>People with schizophrenia often present with cognitive impairments that may hinder their ability to learn about their condition. Education platforms powered by large language models (LLMs) have the potential to improve the accessibility of mental health information. However, the black-box nature of LLMs raises ethical and safety concerns regarding the controllability of chatbots. In particular, prompt-engineered chatbots may drift from their intended role as the conversation progresses and become more prone to hallucinations.</p><p><strong>Objective: </strong>This study aimed to develop and evaluate a critical analysis filter (CAF) system that ensures that an LLM-powered prompt-engineered chatbot reliably complies with its predefined instructions and scope while delivering validated mental health information.</p><p><strong>Methods: </strong>For a proof of concept, we prompt engineered an educational chatbot for schizophrenia powered by GPT-4 that could dynamically access information from a schizophrenia manual written for people with schizophrenia and their caregivers. In the CAF, a team of prompt-engineered LLM agents was used to critically analyze and refine the chatbot's responses and deliver real-time feedback to the chatbot. To assess the ability of the CAF to re-establish the chatbot's adherence to its instructions, we generated 3 conversations (by conversing with the chatbot with the CAF disabled) wherein the chatbot started to drift from its instructions toward various unintended roles. We used these checkpoint conversations to initialize automated conversations between the chatbot and adversarial chatbots designed to entice it toward unintended roles. Conversations were repeatedly sampled with the CAF enabled and disabled. In total, 3 human raters independently rated each chatbot response according to criteria developed to measure the chatbot's integrity, specifically, its transparency (such as admitting when a statement lacked explicit support from its scripted sources) and its tendency to faithfully convey the scripted information in the schizophrenia manual.</p><p><strong>Results: </strong>In total, 36 responses (3 different checkpoint conversations, 3 conversations per checkpoint, and 4 adversarial queries per conversation) were rated for compliance with the CAF enabled and disabled. Activating the CAF resulted in a compliance score that was considered acceptable (≥2) in 81% (7/36) of the responses, compared to only 8.3% (3/36) when the CAF was deactivated.</p><p><strong>Conclusions: </strong>Although more rigorous testing in realistic scenarios is needed, our results suggest that self-reflection mechanisms could enable LLMs to be used effectively and safely in educational mental health platforms. 
This approach harnesses the flexibility of LLMs while reliably constraining their scope to appropriate and accurate interactions.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":" ","pages":"e69820"},"PeriodicalIF":0.0000,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11982747/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"JMIR AI","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2196/69820","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Background: People with schizophrenia often present with cognitive impairments that may hinder their ability to learn about their condition. Education platforms powered by large language models (LLMs) have the potential to improve the accessibility of mental health information. However, the black-box nature of LLMs raises ethical and safety concerns regarding the controllability of chatbots. In particular, prompt-engineered chatbots may drift from their intended role as the conversation progresses and become more prone to hallucinations.
Objective: This study aimed to develop and evaluate a critical analysis filter (CAF) system that ensures that an LLM-powered prompt-engineered chatbot reliably complies with its predefined instructions and scope while delivering validated mental health information.
Methods: As a proof of concept, we prompt-engineered a GPT-4-powered educational chatbot for schizophrenia that could dynamically access information from a schizophrenia manual written for people with schizophrenia and their caregivers. In the CAF, a team of prompt-engineered LLM agents critically analyzed and refined the chatbot's responses and delivered real-time feedback to it. To assess the ability of the CAF to re-establish the chatbot's adherence to its instructions, we generated 3 checkpoint conversations (by conversing with the chatbot with the CAF disabled) in which the chatbot started to drift from its instructions toward various unintended roles. We used these checkpoint conversations to initialize automated conversations between the chatbot and adversarial chatbots designed to entice it toward unintended roles. Conversations were repeatedly sampled with the CAF enabled and disabled. In total, 3 human raters independently rated each chatbot response according to criteria developed to measure the chatbot's integrity, specifically its transparency (such as admitting when a statement lacked explicit support from its scripted sources) and its tendency to faithfully convey the scripted information in the schizophrenia manual.
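The abstract describes the CAF only at a high level. The sketch below illustrates one way such a critique-and-refine loop could be wired up; the `call_llm` helper, the agent prompts, and the single-critic structure are assumptions made for illustration, not the published implementation.

```python
# Minimal sketch of a critical analysis filter (CAF) loop, assuming a generic
# call_llm(system_prompt, messages) helper around whichever chat-completion API
# is used (GPT-4 in the study). Prompts and loop structure are illustrative.
from typing import Callable, Dict, List

Message = Dict[str, str]  # e.g., {"role": "user", "content": "..."}

CHATBOT_SYSTEM = (
    "You are an educational chatbot on schizophrenia. Convey only information "
    "supported by the provided manual excerpts, and say so when a question "
    "falls outside that scope."
)
CRITIC_SYSTEM = (
    "You are a compliance critic. Check whether the draft response stays within "
    "the chatbot's instructions and the manual excerpts. Reply 'PASS' if it "
    "does; otherwise give concrete feedback on what to fix."
)

def caf_respond(
    call_llm: Callable[[str, List[Message]], str],
    history: List[Message],
    manual_excerpts: str,
    max_rounds: int = 2,
) -> str:
    """Draft a response, then let a critic agent refine it before delivery."""
    context = history + [
        {"role": "system", "content": f"Manual excerpts:\n{manual_excerpts}"}
    ]
    draft = call_llm(CHATBOT_SYSTEM, context)
    for _ in range(max_rounds):
        verdict = call_llm(
            CRITIC_SYSTEM, context + [{"role": "assistant", "content": draft}]
        )
        if verdict.strip().upper().startswith("PASS"):
            break  # the critic found no compliance issues
        # Feed the critic's feedback back to the chatbot and regenerate.
        draft = call_llm(
            CHATBOT_SYSTEM,
            context
            + [
                {"role": "assistant", "content": draft},
                {"role": "system",
                 "content": f"Reviewer feedback: {verdict} Revise your response."},
            ],
        )
    return draft
```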
Results: In total, 36 responses (3 checkpoint conversations, 3 conversations per checkpoint, and 4 adversarial queries per conversation) were rated for compliance with the CAF enabled and with it disabled. Activating the CAF resulted in a compliance score that was considered acceptable (≥2) in 81% (29/36) of the responses, compared to only 8.3% (3/36) when the CAF was deactivated.
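As a rough illustration of how these compliance figures can be aggregated, the sketch below counts a response as acceptable when its score reaches the ≥2 threshold; the per-rater averaging and all names are illustrative assumptions, while the threshold and the 36-response total come from the abstract.

```python
# Sketch of the compliance aggregation: a response counts as acceptable when
# its (assumed rater-averaged) compliance score is at least 2.
from statistics import mean
from typing import Dict, List

def acceptable_fraction(ratings: Dict[str, List[float]], threshold: float = 2.0) -> float:
    """ratings maps a response ID to the scores given by the 3 raters."""
    scores = [mean(per_rater) for per_rater in ratings.values()]
    return sum(score >= threshold for score in scores) / len(scores)

# With the CAF enabled, 29 of 36 responses clear the threshold (~81%);
# with it disabled, only 3 of 36 do (~8.3%).
```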
Conclusions: Although more rigorous testing in realistic scenarios is needed, our results suggest that self-reflection mechanisms could enable LLMs to be used effectively and safely in educational mental health platforms. This approach harnesses the flexibility of LLMs while reliably constraining their scope to appropriate and accurate interactions.