Bots in white coats: are large language models the future of patient education? A multicenter cross-sectional analysis.

IF 12.5 | Zone 2, Medicine | Q1 SURGERY
Ughur Aghamaliyev, Javad Karimbayli, Athanasios Zamparas, Florian Bösch, Michael Thomas, Thomas Schmidt, Christian Krautz, Christoph Kahlert, Sebastian Schölch, Martin K Angele, Hanno Niess, Markus O Guba, Jens Werner, Matthias Ilmer, Bernhard W Renz
International Journal of Surgery, pp. 2376-2384. Published 2025-03-01. Journal Article. DOI: 10.1097/JS9.0000000000002250 (https://doi.org/10.1097/JS9.0000000000002250)
Citations: 0

Abstract

Objectives: Every year, around 300 million surgeries are conducted worldwide, with an estimated 4.2 million deaths occurring within 30 days after surgery. Adequate patient education is crucial, but often falls short due to the stress patients experience before surgery. Large language models (LLMs) can significantly enhance this process by delivering thorough information and addressing patient concerns that might otherwise go unnoticed.

Material and methods: This cross-sectional study evaluated Chat Generative Pretrained Transformer-4o's audio-based responses to frequently asked questions (FAQs) regarding six general surgical procedures. Three experienced surgeons and two senior residents formulated seven general and three procedure-specific FAQs for both preoperative and postoperative situations, covering six surgical scenarios (major: pancreatic head resection, rectal resection, total gastrectomy; minor: cholecystectomy, Lichtenstein procedure, hemithyroidectomy). In total, 120 audio responses were generated, transcribed, and assessed by 11 surgeons from 6 different German university hospitals.
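The total of 120 responses follows from the design described above. As a minimal sketch, assuming the seven general and three procedure-specific FAQs were asked once per phase and per procedure (an interpretation of the abstract, not a detail it confirms), the question grid can be enumerated as follows. The names `PROCEDURES`, `PHASES`, and `build_question_grid` are hypothetical and chosen for illustration only.

```python
from itertools import product

# Procedures and their major/minor grouping, as listed in the abstract.
PROCEDURES = {
    "major": ["pancreatic head resection", "rectal resection", "total gastrectomy"],
    "minor": ["cholecystectomy", "Lichtenstein procedure", "hemithyroidectomy"],
}
PHASES = ["preoperative", "postoperative"]

# Assumption: these counts apply per phase and per procedure.
GENERAL_PER_PHASE = 7
SPECIFIC_PER_PHASE = 3


def build_question_grid():
    """Return one tuple per FAQ slot: (category, procedure, phase, kind, index)."""
    grid = []
    for category, procedures in PROCEDURES.items():
        for procedure, phase in product(procedures, PHASES):
            for kind, count in (
                ("general", GENERAL_PER_PHASE),
                ("procedure-specific", SPECIFIC_PER_PHASE),
            ):
                for index in range(1, count + 1):
                    grid.append((category, procedure, phase, kind, index))
    return grid


grid = build_question_grid()
print(len(grid))  # (7 + 3) FAQs x 2 phases x 6 procedures = 120
```

Under this reading, each procedure contributes 20 responses, and the major and minor groups contribute 60 each, which matches the 120 audio responses reported.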

Results: ChatGPT-4o demonstrated strong performance, achieving an average score of 4.12/5 for accuracy, 4.46/5 for relevance, and 0.22/5 for potential harm across 120 questions. Postoperative responses surpassed preoperative ones in both accuracy and relevance, while also exhibiting lower potential for harm. Additionally, responses related to minor surgeries were slightly but significantly more accurate than those for major surgeries.

Conclusions: This study underscores GPT-4o's potential to enhance patient education both before and after surgery by delivering accurate and relevant responses to FAQs about various surgical procedures. Responses regarding the postoperative course proved to be more accurate and less harmful than those addressing the preoperative course. Although a few responses carried moderate risks, the overall performance was robust, indicating GPT-4o's value in patient education. The study suggests the development of hospital-specific applications or the integration of GPT-4o into interactive robotic systems to provide patients with reliable, immediate answers, thereby improving patient satisfaction and informed decision-making.

Source journal
CiteScore: 17.70
Self-citation rate: 3.30%
Articles published: 0
Review time: 6-12 weeks
About the journal: The International Journal of Surgery (IJS) has a broad scope, encompassing all surgical specialties. Its primary objective is to facilitate the exchange of crucial ideas and lines of thought between and across these specialties. By doing so, the journal aims to counter the growing trend of increasing sub-specialization, which can result in "tunnel-vision" and the isolation of significant surgical advancements within specific specialties.