Evaluating the role of AI chatbots in patient education for abdominal aortic aneurysms: a comparison of ChatGPT and conventional resources

Impact Factor 1.5 · CAS Zone 4 (Medicine) · JCR Q3 (Surgery)
Harry Collin MD, Chelsea Tong MBChB, Abhishekh Srinivas MD, Angus Pegler MD, Philip Allan MB ChB, Daniel Hagley MBBS FRACS
ANZ Journal of Surgery, vol. 95, no. 4, pp. 784–788. Published 2025-03-05. DOI: 10.1111/ans.70053 (https://onlinelibrary.wiley.com/doi/10.1111/ans.70053). Citations: 0.

Abstract

Background

Abdominal aortic aneurysms (AAA) carry significant risks, yet patient understanding is often limited and online resources are typically of low quality. ChatGPT, an artificial intelligence (AI) chatbot, presents a new frontier in patient education, but concerns remain about misinformation. This study evaluates the quality of ChatGPT-generated patient information on AAA.

Methods

Eight patient questions on AAA were sourced from Healthdirect Australia (HDA), a reputable online patient-information resource funded by the Australian Government, and input into ChatGPT's free (ChatGPT-4o mini) and paid (ChatGPT-4) models. A vascular surgeon evaluated response appropriateness. Readability was assessed using the Flesch–Kincaid test. The Patient Education Materials Assessment Tool (PEMAT) measured understandability and actionability, with responses scoring ≥75% on both considered high-quality.
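The Flesch–Kincaid grade level referenced above is a standard formula combining sentence length and syllable density. As an illustration only (the study presumably used validated tooling), a minimal sketch in Python with a crude syllable-counting heuristic might look like this:

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count groups of consecutive vowels, then drop a
    # trailing silent "e". Real readability tools use dictionaries.
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    if word.lower().endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def flesch_kincaid_grade(text: str) -> float:
    # Standard Flesch-Kincaid grade-level formula:
    # 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)
```

A grade of roughly 13 or above corresponds to the "college level" the Results attribute to ChatGPT, while 10–12 matches the HDA material.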

Results

All responses were deemed clinically appropriate. Mean response length was longer for ChatGPT than for HDA. Readability was at a college level for ChatGPT, while HDA was at a 10th to 12th-grade level. One response was high-quality (generated by paid ChatGPT) with a PEMAT actionability score of ≥75%. Actionability scores were otherwise low across all sources; ChatGPT responses were more likely to contain identifiable actions, although these were often not clearly presented. ChatGPT responses were marginally more understandable than HDA.

Conclusions

ChatGPT-generated information on AAA was appropriate and understandable, outperforming HDA in both aspects. However, AI responses are at a more advanced reading level and lack actionable instructions. AI chatbots show promise as supplemental tools for AAA patient education, but further refinement is needed to enhance their effectiveness in supporting informed decision-making.

ANZ Journal of Surgery
Field: Medicine — Surgery
CiteScore: 2.50
Self-citation rate: 11.80%
Annual publications: 720
Review time: 2 months
Journal description: ANZ Journal of Surgery is published by Wiley on behalf of the Royal Australasian College of Surgeons to provide a medium for the publication of peer-reviewed original contributions related to clinical practice and/or research in all fields of surgery and related disciplines. It also provides a programme of continuing education for surgeons. All articles are peer-reviewed by at least two researchers expert in the field of the submitted paper.