Exploring the capacities of ChatGPT: A comprehensive evaluation of its accuracy and repeatability in addressing Helicobacter pylori-related queries

IF 4.3 | CAS Zone 2 (Medicine) | Q1 GASTROENTEROLOGY & HEPATOLOGY
Helicobacter | Pub Date: 2024-06-13 | DOI: 10.1111/hel.13078
Yongkang Lai, Foqiang Liao, Jiulong Zhao, Chunping Zhu, Yi Hu, Zhaoshen Li
{"title":"Exploring the capacities of ChatGPT: A comprehensive evaluation of its accuracy and repeatability in addressing helicobacter pylori-related queries","authors":"Yongkang Lai,&nbsp;Foqiang Liao,&nbsp;Jiulong Zhao,&nbsp;Chunping Zhu,&nbsp;Yi Hu,&nbsp;Zhaoshen Li","doi":"10.1111/hel.13078","DOIUrl":null,"url":null,"abstract":"<div>\n \n \n <section>\n \n <h3> Background</h3>\n \n <p>Educational initiatives on <i>Helicobacter pylori</i> (<i>H. pylori</i>) constitute a highly effective approach for preventing its infection and establishing standardized protocols for its eradication. ChatGPT, a large language model, is a potentially patient-friendly online tool capable of providing health-related knowledge. This study aims to assess the accuracy and repeatability of ChatGPT in responding to questions related to <i>H. pylori.</i></p>\n </section>\n \n <section>\n \n <h3> Materials and Methods</h3>\n \n <p>Twenty-one common questions about <i>H. pylori</i> were collected and categorized into four domains: basic knowledge, diagnosis, treatment, and prevention. ChatGPT was utilized to individually answer the aforementioned 21 questions. Its responses were independently assessed by two experts on <i>H. pylori</i>. Questions with divergent ratings were resolved by a third reviewer. Cohen's kappa coefficient was calculated to assess the consistency between the scores of the two reviewers.</p>\n </section>\n \n <section>\n \n <h3> Results</h3>\n \n <p>The responses of ChatGPT on <i>H. pylori</i>-related questions were generally satisfactory, with 61.9% marked as “completely correct” and 33.33% as “correct but inadequate.” The repeatability of the responses of ChatGPT to <i>H. pylori</i>-related questions was 95.23%. Among the responses, those related to prevention (comprehensive: 75%) had the best response, followed by those on treatment (comprehensive: 66.7%), basic knowledge (comprehensive: 60%), and diagnosis (comprehensive: 50%). In the “treatment” domain, 16.6% of the ChatGPT responses were categorized as “mixed with correct or incorrect/outdated data.” However, ChatGPT still lacks relevant knowledge regarding <i>H. pylori</i> resistance and the use of sensitive antibiotics.</p>\n </section>\n \n <section>\n \n <h3> Conclusions</h3>\n \n <p>ChatGPT can provide correct answers to the majority of <i>H. pylori</i>-related queries. It exhibited good reproducibility and delivered responses that were easily comprehensible to patients. Further enhancement of real-time information updates and correction of inaccurate information will make ChatGPT an essential auxiliary tool for providing accurate <i>H. pylori</i>-related health information to patients.</p>\n </section>\n </div>","PeriodicalId":13223,"journal":{"name":"Helicobacter","volume":null,"pages":null},"PeriodicalIF":4.3000,"publicationDate":"2024-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Helicobacter","FirstCategoryId":"3","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/hel.13078","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"GASTROENTEROLOGY & HEPATOLOGY","Score":null,"Total":0}
Citations: 0

Abstract

Background

Educational initiatives on Helicobacter pylori (H. pylori) are a highly effective approach to preventing infection and establishing standardized eradication protocols. ChatGPT, a large language model, is a potentially patient-friendly online tool capable of providing health-related knowledge. This study aims to assess the accuracy and repeatability of ChatGPT in responding to H. pylori-related questions.

Materials and Methods

Twenty-one common questions about H. pylori were collected and categorized into four domains: basic knowledge, diagnosis, treatment, and prevention. ChatGPT was used to answer each of the 21 questions individually. Its responses were independently assessed by two experts on H. pylori, and questions with divergent ratings were resolved by a third reviewer. Cohen's kappa coefficient was calculated to assess the agreement between the two reviewers' scores.
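For illustration, Cohen's kappa for two raters scoring the same set of responses can be computed as in the following sketch. The rating scale encoding and the example scores below are hypothetical and are not the study's data; they only show how the agreement statistic reported in the Methods is obtained.

# Illustrative sketch: Cohen's kappa for two raters scoring the same 21 responses.
# The labels and example ratings are hypothetical, not the study's actual data.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    # Observed agreement: fraction of items on which the two raters agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    p_e = sum((freq_a[label] / n) * (freq_b[label] / n) for label in labels)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings for 21 responses, encoded as
# 2 = "completely correct", 1 = "correct but inadequate", 0 = "mixed".
rater_1 = [2, 2, 2, 1, 2, 1, 2, 2, 0, 2, 1, 2, 2, 1, 2, 2, 1, 2, 1, 2, 1]
rater_2 = [2, 2, 2, 1, 2, 1, 2, 2, 0, 2, 2, 2, 2, 1, 2, 2, 1, 2, 1, 2, 1]
print(f"Cohen's kappa = {cohens_kappa(rater_1, rater_2):.2f}")
# Equivalently: sklearn.metrics.cohen_kappa_score(rater_1, rater_2)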

Results

ChatGPT's responses to H. pylori-related questions were generally satisfactory, with 61.9% rated “completely correct” and 33.33% “correct but inadequate.” The repeatability of ChatGPT's responses to H. pylori-related questions was 95.23%. Across the four domains, prevention yielded the best responses (comprehensive: 75%), followed by treatment (comprehensive: 66.7%), basic knowledge (comprehensive: 60%), and diagnosis (comprehensive: 50%). In the treatment domain, 16.6% of ChatGPT's responses were categorized as “mixed with correct and incorrect/outdated data.” However, ChatGPT still lacks relevant knowledge regarding H. pylori antibiotic resistance and the selection of antibiotics to which the organism remains susceptible.
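Because 21 questions were assessed, each overall percentage corresponds to a whole-number count of questions. The sketch below back-calculates those counts from the reported figures; the counts are inferences from the percentages, not data stated in the paper.

# Back-calculation sketch: mapping the reported overall percentages onto the
# 21 assessed questions. Counts are inferred, not taken from the paper.
total_questions = 21
counts = {
    "completely correct": 13,      # 13/21 ≈ 61.9%
    "correct but inadequate": 7,   #  7/21 ≈ 33.3%
    "repeatable across runs": 20,  # 20/21 ≈ 95.2% (reported as 95.23%)
}
for label, count in counts.items():
    print(f"{label}: {count}/{total_questions} = {100 * count / total_questions:.2f}%")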

Conclusions

ChatGPT can provide correct answers to the majority of H. pylori-related queries. It exhibited good reproducibility and delivered responses that were easily comprehensible to patients. Further enhancement of real-time information updates and correction of inaccurate information will make ChatGPT an essential auxiliary tool for providing accurate H. pylori-related health information to patients.

Source Journal
Helicobacter (Medicine – Microbiology)
CiteScore: 8.40
Self-citation rate: 9.10%
Annual publications: 76
Review turnaround: 2 months
Journal Introduction: Helicobacter is edited by Professor David Y. Graham. The editorial and peer-review process is independent; whenever there is a conflict of interest, the editor and editorial board declare their interests and affiliations. Helicobacter recognises the critical role that has been established for Helicobacter pylori in peptic ulcer, gastric adenocarcinoma, and primary gastric lymphoma. As new Helicobacter species are now regularly being discovered, Helicobacter covers the entire range of Helicobacter research, increasing communication among the fields of gastroenterology, microbiology, vaccine development, and laboratory animal science.