Dr. LLM Will See You Now: The Ability of ChatGPT to Provide Geographically Tailored Colorectal Cancer Screening and Surveillance Recommendations.

Impact factor 2.9; Zone 3 (Medicine); JCR Q1, MEDICINE, GENERAL & INTERNAL
Aisling Zeng, Jacqueline Steinke, Horea-Florin Bocse, Matteo De Pastena
Journal of Clinical Medicine, vol. 14, no. 14. Journal Article. Published 2025-07-18.
DOI: 10.3390/jcm14145101 (https://doi.org/10.3390/jcm14145101)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12294925/pdf/
Citations: 0

Abstract

Background/Objectives: This study evaluates the performance of a large language model (LLM) in providing geographically tailored colorectal cancer screening and surveillance recommendations to gastrointestinal surgeons. Methods: Fifty-four patient cases, varying by age and family history, were developed based on colorectal cancer guidelines. Standardized prompts with predefined query terms were used to query ChatGPT-4.5 on 18 April 2025, from four locations: Canada, Italy, Romania, and the United Kingdom. Responses were classified as "Correct," "Partially Correct," or "Incorrect" based on clinical guidelines and expert recommendations for each country. Outcomes were analyzed using descriptive statistics. Results: ChatGPT provided recommendations on screening eligibility, test interpretation, the management of positive results, and surveillance intervals. Correct recommendations were given for 50.0% (27/54) of cases in Canada, 63.0% (34/54) of cases in Italy, 40.7% (22/54) of cases in Romania, and 55.6% (30/54) of cases in the United Kingdom. Queries in Italian yielded correct guidance for 64.8% (35/54) of cases, while queries in Romanian yielded correct guidance for 40.7% (22/54) of cases. Notably, Romania and Italy lacked detailed guidelines for polyp management and post-test surveillance. A key finding was the inconsistency between ChatGPT-generated titles and corresponding recommendations, which may impact its reliability in clinical decision-making. Conclusions: ChatGPT-4.5's performance varies by country and language, highlighting inconsistencies in geographically tailored recommendations. This study highlights limitations associated with the training data cutoff and the potential biases introduced by model-generated responses. Healthcare professionals should recognize these limitations and the possible gaps in guideline availability, particularly for high-risk screening, polyp management, and surveillance in certain European countries.
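The percentages reported in the Results can be re-derived from the stated counts. The following is a minimal sketch of that arithmetic only, not the authors' analysis code; the variable and function names (`correct_counts`, `percent_correct`, `rates`) are illustrative assumptions, and the counts are copied from the abstract.

```python
# Counts of "Correct" responses out of 54 cases, as reported in the abstract.
# All names here are illustrative; this is not the authors' code.
TOTAL_CASES = 54

correct_counts = {
    "Canada": 27,
    "Italy": 34,
    "Romania": 22,
    "United Kingdom": 30,
    "Italian-language queries": 35,
    "Romanian-language queries": 22,
}


def percent_correct(correct: int, total: int = TOTAL_CASES) -> float:
    """Return the share of correct responses as a percentage, to one decimal."""
    return round(100 * correct / total, 1)


# Percentage of correct recommendations per query setting.
rates = {setting: percent_correct(n) for setting, n in correct_counts.items()}

for setting, rate in rates.items():
    print(f"{setting}: {correct_counts[setting]}/{TOTAL_CASES} = {rate:.1f}%")
```

Running the sketch reproduces the abstract's figures (for example, 34/54 rounds to 63.0% and 22/54 to 40.7%). Note that the Romanian-location and Romanian-language results share the same count.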

Source journal
Journal of Clinical Medicine (MEDICINE, GENERAL & INTERNAL)
CiteScore: 5.70
Self-citation rate: 7.70%
Articles published: 6468
Review time: 16.32 days
About the journal: Journal of Clinical Medicine (ISSN 2077-0383) is an international scientific open access journal, providing a platform for advances in health care/clinical practices, the study of direct observation of patients and general medical research. This multi-disciplinary journal is aimed at a wide audience of medical researchers and healthcare professionals. Unique features of this journal: manuscripts regarding original research and ideas will be particularly welcomed. JCM also accepts reviews, communications, and short notes. There is no limit to publication length: our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible.