Assessing Accuracy of ChatGPT on Addressing Helicobacter pylori Infection-Related Questions: A National Survey and Comparative Study

IF 4.3, Zone 2 (Medicine), Q1 GASTROENTEROLOGY & HEPATOLOGY

Helicobacter. Pub Date: 2024-07-30. DOI: 10.1111/hel.13116

Yi Hu, Yongkang Lai, Foqiang Liao, Xu Shu, Yin Zhu, Yi-Qi Du, Nong-Hua Lu, National Clinical Research Center for Digestive Diseases (Shanghai)

Helicobacter, Volume 29, Issue 4. Publication type: Journal Article. Open access: no. Platform: Semanticscholar. Full text: https://onlinelibrary.wiley.com/doi/10.1111/hel.13116

Citations: 0

Abstract


Assessing Accuracy of ChatGPT on Addressing Helicobacter pylori Infection-Related Questions: A National Survey and Comparative Study

Background

ChatGPT is a novel online large language model used as a source of up-to-date and useful health-related knowledge for patients and clinicians. However, its performance on Helicobacter pylori infection-related questions remains unknown. This study aimed to evaluate the accuracy of ChatGPT's responses to H. pylori-related questions compared with that of gastroenterologists during the same period.

Methods

Twenty-five H. pylori-related questions were selected from five domains (Indication, Diagnostics, Treatment, Gastric cancer and prevention, and Gut Microbiota) based on the Maastricht VI Consensus report. Each question was tested three times with ChatGPT3.5 and ChatGPT4. Two independent H. pylori experts assessed the responses from ChatGPT, with discrepancies resolved by a third reviewer. Simultaneously, a nationwide survey with the same questions was conducted among 1279 gastroenterologists and 154 medical students. The accuracy of responses from ChatGPT3.5 and ChatGPT4 was compared with that of gastroenterologists.
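
The scoring procedure described above (two raters, a third reviewer for disagreements, accuracy computed per run) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the boolean correct/incorrect grading scheme, and the per-run grades are all hypothetical and are not taken from the study, which does not publish its scoring code or item-level data.

```python
from statistics import median


def adjudicate(rater_a: bool, rater_b: bool, rater_c: bool) -> bool:
    """Return the final grade for one response.

    If the two independent raters agree, their shared verdict stands;
    otherwise the third reviewer's verdict decides (assumed rule).
    """
    return rater_a if rater_a == rater_b else rater_c


def run_accuracy(grades: list[bool]) -> float:
    """Percentage of questions graded correct within a single run."""
    return 100.0 * sum(grades) / len(grades)


# Hypothetical final grades: 25 questions, each asked in 3 separate runs.
# The sets list the (made-up) question indices graded incorrect per run.
N_QUESTIONS = 25
wrong_by_run = [{4, 17}, {4}, {4, 9, 17}]
runs = [[q not in wrong for q in range(N_QUESTIONS)] for wrong in wrong_by_run]

accuracies = [run_accuracy(run) for run in runs]
median_accuracy = median(accuracies)

print(accuracies)       # per-run accuracy in percent
print(median_accuracy)  # median across the three runs
```

With 25 questions, each question contributes 4 percentage points, so an accuracy of 92% corresponds to 23 correct answers out of 25.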

Results

Overall, both ChatGPT3.5 and ChatGPT4 demonstrated high accuracy, with median accuracy rates of 92% for each of the three responses, surpassing the accuracy of nationwide gastroenterologists (median: 80%) and equivalent to that of senior gastroenterologists. Compared with ChatGPT3.5, ChatGPT4 provided more concise responses with the same accuracy. ChatGPT3.5 performed well in the Indication, Treatment, and Gut Microbiota domains, whereas ChatGPT4 excelled in the Diagnostics, Gastric cancer and prevention, and Gut Microbiota domains.

Conclusion

ChatGPT exhibited high accuracy and reproducibility in addressing H. pylori-related questions, except for decisions on H. pylori treatment. It performed at the level of senior gastroenterologists and could serve as an auxiliary information tool for patients and clinicians.


Source journal

Helicobacter (Medicine: Microbiology)

CiteScore

8.40

Self-citation rate

9.10%

Articles published

Review time

2 months

About the journal: Helicobacter is edited by Professor David Y Graham. The editorial and peer review process is an independent process. Whenever there is a conflict of interest, the editor and editorial board will declare their interests and affiliations. Helicobacter recognises the critical role that has been established for Helicobacter pylori in peptic ulcer, gastric adenocarcinoma, and primary gastric lymphoma. As new helicobacter species are now regularly being discovered, Helicobacter covers the entire range of helicobacter research, increasing communication among the fields of gastroenterology, microbiology, vaccine development, and laboratory animal science.