AI at the Forefront: Navigating Oncologic Care for Six Gastrointestinal Cancers According to the NCCN Guidelines Utilizing Gemini-1.0 Ultra and ChatGPT-4.

Impact Factor 1.9 · CAS Tier 3 (Medicine) · JCR Q3 (Oncology)
Tamir E Bresler, Tyler Wilson, Tadevos Makaryan, Shivam Pandya, Kevin Palmer, Ryan Meyer, Zin M Htway, Manabu Fujita
{"title":"AI at the Forefront: Navigating Oncologic Care for Six Gastrointestinal Cancers According to the NCCN Guidelines Utilizing Gemini-1.0 Ultra and ChatGPT-4.","authors":"Tamir E Bresler, Tyler Wilson, Tadevos Makaryan, Shivam Pandya, Kevin Palmer, Ryan Meyer, Zin M Htway, Manabu Fujita","doi":"10.1002/jso.70005","DOIUrl":null,"url":null,"abstract":"<p><strong>Background and objectives: </strong>We explored the ability of large language models (LLMs) ChatGPT-4 and Gemini 1.0 Ultra in guiding clinical decision-making for six gastrointestinal cancers using the National Comprehensive Cancer Network (NCCN) Clinical Practice Guidelines.</p><p><strong>Methods: </strong>We reviewed the NCCN Guidelines for anal squamous cell carcinoma, small bowel, ampullary, and pancreatic adenocarcinoma, and biliary tract and gastric cancers. Clinical questions were designed and categorized by type, queried up to three times, and rated on a Likert scale: (5) Correct; (4) Correct following clarification; (3) Correct but incomplete; (2) Partially incorrect; (1) Absolutely incorrect. Subgroup analysis was conducted on Correctness (scores 3-5) and Accuracy (scores 4-5).</p><p><strong>Results: </strong>A total of 270 questions were generated (range-per-cancer 32-68). ChatGPT-4 versus Gemini 1.0 Ultra score differences were not statistically-significant (Mean Rank 278.30 vs. 262.70, p = 0.222). Correctness was seen in 77.78% versus 75.93% of responses, and Accuracy in 64.81% versus 57.41%. There were no statistically-significant differences in Correctness or Accuracy between LLMs in terms of question or cancer type.</p><p><strong>Conclusions: </strong>Both LLMs demonstrated a limited capacity to assist with complex clinical decision-making. Their current Accuracy level falls below the acceptable threshold for clinical use. Future studies exploring LLMs in the healthcare domain are warranted.</p>","PeriodicalId":17111,"journal":{"name":"Journal of Surgical Oncology","volume":" ","pages":""},"PeriodicalIF":1.9000,"publicationDate":"2025-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Surgical Oncology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1002/jso.70005","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ONCOLOGY","Score":null,"Total":0}
Citations: 0

Abstract

Background and objectives: We explored the ability of the large language models (LLMs) ChatGPT-4 and Gemini 1.0 Ultra to guide clinical decision-making for six gastrointestinal cancers using the National Comprehensive Cancer Network (NCCN) Clinical Practice Guidelines.

Methods: We reviewed the NCCN Guidelines for anal squamous cell carcinoma; small bowel, ampullary, and pancreatic adenocarcinoma; and biliary tract and gastric cancers. Clinical questions were designed and categorized by type, queried up to three times each, and rated on a Likert scale: (5) Correct; (4) Correct following clarification; (3) Correct but incomplete; (2) Partially incorrect; (1) Absolutely incorrect. Subgroup analyses were conducted on Correctness (scores 3-5) and Accuracy (scores 4-5).
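To make the rating scheme concrete, the sketch below (hypothetical code, not the authors' analysis; the ratings shown are invented placeholders) maps the Likert scores to the two binary subgroup measures:

```python
# Hypothetical sketch of the paper's scoring scheme (not the authors' code).
# Each LLM response receives a Likert rating from 1 to 5; the two subgroup
# measures are binary cutoffs on that rating.

def is_correct(score: int) -> bool:
    """Correctness: scores 3-5 (correct, possibly incomplete or after clarification)."""
    return score >= 3

def is_accurate(score: int) -> bool:
    """Accuracy: scores 4-5 (fully correct, at most requiring clarification)."""
    return score >= 4

# Invented ratings for a handful of responses, for illustration only:
ratings = [5, 4, 3, 2, 5, 1, 4, 3]
correctness = sum(is_correct(s) for s in ratings) / len(ratings)
accuracy = sum(is_accurate(s) for s in ratings) / len(ratings)
print(f"Correctness: {correctness:.2%}, Accuracy: {accuracy:.2%}")
```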

Results: A total of 270 questions were generated (range per cancer: 32-68). The difference in scores between ChatGPT-4 and Gemini 1.0 Ultra was not statistically significant (mean rank 278.30 vs. 262.70, p = 0.222). Correctness was seen in 77.78% versus 75.93% of responses, and Accuracy in 64.81% versus 57.41%. There were no statistically significant differences in Correctness or Accuracy between the LLMs by question type or cancer type.
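The abstract does not name the statistical test, but reporting mean ranks alongside a p value is consistent with a Mann-Whitney U comparison of the two score distributions. A minimal sketch under that assumption, using invented score arrays rather than study data, might look like:

```python
# Assumed analysis: a Mann-Whitney U test comparing the two LLMs' Likert
# score distributions (the test is inferred from the reported mean ranks,
# not confirmed by the abstract; the scores below are placeholders).
from scipy.stats import mannwhitneyu

chatgpt_scores = [5, 4, 3, 5, 2, 4, 5, 3, 1, 4]   # hypothetical ratings
gemini_scores  = [4, 3, 3, 5, 2, 4, 4, 2, 1, 5]   # hypothetical ratings

stat, p = mannwhitneyu(chatgpt_scores, gemini_scores, alternative="two-sided")
print(f"U = {stat}, p = {p:.3f}")  # p > 0.05 => no significant difference
```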

Conclusions: Both LLMs demonstrated a limited capacity to assist with complex clinical decision-making. Their current Accuracy level falls below the acceptable threshold for clinical use. Future studies exploring LLMs in the healthcare domain are warranted.

Source journal: Journal of Surgical Oncology
CiteScore: 4.70
Self-citation rate: 4.00%
Articles published per year: 367
Review turnaround: 2 months
About the journal: The Journal of Surgical Oncology offers peer-reviewed, original papers in the field of surgical oncology and broadly related surgical sciences, including reports on experimental and laboratory studies. As an international journal, the editors encourage participation from leading surgeons around the world. The JSO is the representative journal for the World Federation of Surgical Oncology Societies. Publishing 16 issues in 2 volumes each year, the journal accepts Research Articles, in-depth Reviews of timely interest, Letters to the Editor, and invited Editorials. Guest Editors from the JSO Editorial Board oversee multiple special Seminars issues each year. These Seminars include multifaceted Reviews on a particular topic or current issue in surgical oncology, which are invited from experts in the field.