Evaluating a Chatbot as a Companion for Patients With Breast Cancer: Collaborative Pilot Study.

Impact factor: 2.7 (JCR Q2, Oncology)
JMIR Cancer. Pub Date: 2025-08-13. DOI: 10.2196/68426
Sebastian Daniel Boie, Esther Glastetter, Michael Patrick Lux, Felix Balzer, Christof von Kalle, Christian Lenz, Ulrike Müller
JMIR Cancer 2025;11:e68426. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12373300/pdf/
Citations: 0

Abstract

Background: Patients with breast cancer frequently experience significant uncertainty, prompting them to seek detailed, personalized, and reliable medical information to enhance adherence to prescribed treatments, medications, and recommended lifestyle adjustments. Although high-quality information exists within oncology guidelines and patient-oriented resources, the provision of tailored responses to individual patient queries remains challenging, especially for non-English-speaking populations.

Objective: This study aims to evaluate the potential of an artificial intelligence-driven chatbot, specifically leveraging ChatGPT (GPT-4; OpenAI) combined with retrieval-augmented generation, to deliver personalized answers to complex breast cancer-related patient questions in German.

Methods: We collaborated with one of Germany's largest breast cancer patient representation groups to collect authentic patient inquiries and received 118 questions. After initial screening, we selected 104 medical questions, organized into 7 categories: aftercare, bone health, ductal carcinoma in situ, diagnostics, nutrition and supplements, complementary medicine, and therapy. A customized version of GPT-4 was configured with system prompts emphasizing empathetic, evidence-based responses and was integrated with a comprehensive database of guidelines, recommendations, and patient information materials published by recognized German medical societies. To assess chatbot responses, we used 4 evaluation criteria: comprehensibility (clarity from a patient perspective), correctness (accuracy per current medical guidelines), completeness (inclusion of all relevant aspects), and potential harm (risk of undue patient harm or misinformation). Responses were rated on a 5-point Likert scale by a breast cancer expert (correctness, completeness, and potential harm) and by patient representatives (comprehensibility).
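The retrieval-augmented setup described in the Methods can be illustrated with a toy sketch. The abstract does not specify how retrieval or prompting was implemented, so everything here is an assumption for illustration: the `Passage` type, the word-overlap ranking (a stand-in for a real retriever), the `SYSTEM_PROMPT` wording, and the placeholder passages are all hypothetical, and no model is called.

```python
from dataclasses import dataclass

# Hypothetical system prompt; the study's actual prompt wording is not given in the abstract.
SYSTEM_PROMPT = "Answer empathetically and rely only on the provided context."


@dataclass
class Passage:
    source: str  # hypothetical document label
    text: str    # placeholder content, not real guideline text


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words without punctuation."""
    return {word.strip(".,?!").lower() for word in text.split()}


def retrieve(question: str, passages: list[Passage], k: int = 2) -> list[Passage]:
    """Return the k passages sharing the most words with the question.

    Word overlap is a simple stand-in for a real retriever.
    """
    question_words = tokenize(question)
    ranked = sorted(
        passages,
        key=lambda p: len(question_words & tokenize(p.text)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, retrieved: list[Passage]) -> str:
    """Combine system prompt, retrieved context, and the question into one prompt."""
    context = "\n".join(f"[{p.source}] {p.text}" for p in retrieved)
    return f"{SYSTEM_PROMPT}\n\nContext:\n{context}\n\nQuestion: {question}"


# Placeholder passages standing in for the guideline database.
passages = [
    Passage("doc-a", "placeholder passage about bone health"),
    Passage("doc-b", "placeholder passage about aftercare schedules"),
    Passage("doc-c", "placeholder passage about nutrition"),
]

question = "What about bone health?"
top_passages = retrieve(question, passages, k=2)
prompt = build_prompt(question, top_passages)
print(prompt)
```

In a setup like this, the assembled prompt would then be sent to the language model; grounding the answer in retrieved passages is what ties the response to the guideline database rather than to the model's general training data.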

Results: The chatbot provided high-quality responses across multiple dimensions. Of the 499 responses evaluated for comprehensibility, 427 (85.6%) were rated as comprehensible. Among the 104 responses assessed for the remaining dimensions, 91 (87.5%) were rated as correct, 72 (69.2%) as complete, and 93 (89.4%) as nonharmful. Reasons for incomplete answers included omission of reimbursement details, updates from recent therapeutic guidelines, or nuanced recommendations regarding endocrine therapy and aftercare schedules. In addition, 6 (5.8%) of the answers were rated as potentially harmful due to outdated or contextually inappropriate recommendations. The chatbot also performed well in the nutrition and bone health categories despite occasionally incomplete document retrieval.
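The reported percentages follow directly from the counts in the Results. A minimal Python sketch that recomputes them is given here; the dictionary keys are illustrative labels, not identifiers from the study.

```python
# Counts as reported in the Results paragraph: (responses in category, total rated).
counts = {
    "comprehensible": (427, 499),
    "correct": (91, 104),
    "complete": (72, 104),
    "nonharmful": (93, 104),
    "potentially_harmful": (6, 104),
}


def percentage(numerator: int, denominator: int) -> float:
    """Return the share as a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


percentages = {
    label: percentage(n, total) for label, (n, total) in counts.items()
}

for label, value in percentages.items():
    print(f"{label}: {value}%")
```

The nonharmful and potentially harmful counts together cover 99 of the 104 responses; the abstract does not state how the remaining 5 were rated.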

Conclusions: Our findings demonstrate that an artificial intelligence-powered chatbot with GPT-4 and retrieval augmentation can effectively provide personalized, linguistically accessible, and largely accurate information to German-speaking patients with breast cancer. This approach holds considerable promise for improving patient-centered communication, empowering patients to make informed decisions. Nonetheless, observed limitations regarding response completeness and potential harm underscore the critical need for ongoing human oversight. Future research and development should prioritize regularly updated databases, advanced retrieval methods to handle complex document structures, multimodal capabilities, and clearly articulated disclaimers emphasizing the necessity of professional medical consultation. Our evaluation, along with the provided set of realistic patient questions, establishes a benchmark for future development and validation of German-language oncology chatbots.

Source journal: JMIR Cancer (Oncology)
CiteScore: 4.10
Self-citation rate: 0.00%
Articles published: 64
Review time: 12 weeks