Evaluating the Performance of ChatGPT in Urology: A Comparative Study of Knowledge Interpretation and Patient Guidance.

IF 2.9 · CAS Zone 2 (Medicine) · JCR Q1 (Urology & Nephrology)
Journal of Endourology · Pub Date: 2024-08-01 · Epub Date: 2024-05-30 · DOI: 10.1089/end.2023.0413
Bahadır Şahin, Yunus Emre Genç, Kader Doğan, Tarık Emre Şener, Çağrı Akın Şekerci, Yılören Tanıdır, Selçuk Yücel, Tufan Tarcan, Haydar Kamil Çam
Citations: 0

Abstract


Background/Aim: To evaluate the performance of Chat Generative Pre-trained Transformer (ChatGPT), a large language model developed by OpenAI. Materials and Methods: This study comprised three main steps to evaluate the effectiveness of ChatGPT in the urologic field. The first step involved 35 questions prepared by our institution's experts, each with at least 10 years of experience in their field. The responses of two ChatGPT versions were qualitatively compared with the responses of urology residents to the same questions. The second step assessed the reliability of the ChatGPT versions in answering current debate topics. The third step assessed their reliability in providing medical recommendations and directives for questions commonly asked by patients in outpatient and inpatient settings. Results: In the first step, version 4 answered 25 of the 35 questions correctly, while version 3.5 answered only 19 (71.4% vs 54.3%). Residents in their final year of training in our clinic likewise gave a mean of 25 correct answers, and fourth-year residents gave a mean of 19.3. The second step, which evaluated both versions' responses to debated situations in urology, found that both provided variable and inappropriate results. In the last step, both versions had a similar success rate in providing recommendations and guidance to patients, based on expert ratings. Conclusion: The difference between the two versions on the 35 questions in the first step is thought to reflect improvements in ChatGPT's literature and data synthesis abilities. Using ChatGPT to give quick and safe answers to questions from non–health care providers may be a reasonable approach, but it should not be used as a diagnostic tool or to choose among different treatment modalities.
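As a quick sanity check, the percentages reported in the Results follow directly from the raw counts over the 35-question set (a minimal sketch; the helper function name is illustrative, not from the paper):

```python
def accuracy(correct: float, total: int) -> float:
    """Percentage of correct answers, rounded to one decimal place."""
    return round(100 * correct / total, 1)

TOTAL_QUESTIONS = 35

# Figures as reported in the abstract:
gpt4 = accuracy(25, TOTAL_QUESTIONS)            # GPT-4 (and final-year residents, mean 25)
gpt35 = accuracy(19, TOTAL_QUESTIONS)           # GPT-3.5
residents_y4 = accuracy(19.3, TOTAL_QUESTIONS)  # 4th-year residents (mean 19.3)

print(gpt4, gpt35, residents_y4)
```

Note that 19/35 rounds to 54.3%, which the abstract reports loosely as 54%.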

Source journal: Journal of Endourology (Medicine – Urology & Nephrology)
CiteScore: 5.50
Self-citation rate: 14.80%
Articles per year: 254
Review time: 1 month
About the journal: Journal of Endourology, JE Case Reports, and Videourology are the leading peer-reviewed journal, case-reports publication, and innovative videojournal companion covering all aspects of minimally invasive urology research, applications, and clinical outcomes. The leading journal of minimally invasive urology for over 30 years, Journal of Endourology is the essential publication for practicing surgeons who want to keep up with the latest surgical technologies in endoscopic, laparoscopic, robotic, and image-guided procedures as they apply to benign and malignant diseases of the genitourinary tract. This flagship journal includes the companion videojournal Videourology™ with every subscription. While Journal of Endourology remains focused on publishing rigorously peer-reviewed articles, Videourology accepts original videos containing material that has not been reported elsewhere, except in the form of an abstract or a conference presentation.

Journal of Endourology coverage includes:
- The latest laparoscopic, robotic, endoscopic, and image-guided techniques for treating both benign and malignant conditions
- Pioneering research articles
- Controversial cases in endourology
- Techniques in endourology with accompanying videos
- Reviews and epochs in endourology
- An endourology survey section covering relevant manuscripts published in other journals