Title: Evaluating the Performance of ChatGPT in Urology: A Comparative Study of Knowledge Interpretation and Patient Guidance
Authors: Bahadır Şahin, Yunus Emre Genç, Kader Doğan, Tarık Emre Şener, Çağrı Akın Şekerci, Yılören Tanıdır, Selçuk Yücel, Tufan Tarcan, Haydar Kamil Çam
Journal: Journal of Endourology
DOI: 10.1089/end.2023.0413
URL: https://doi.org/10.1089/end.2023.0413
Publication date: 2024-08-01 (Epub 2024-05-30)
Publication type: Journal Article
Impact factor: 2.9
JCR: Q1, Urology & Nephrology
Open access: No
Citations: 0
Abstract
Background/Aim: To evaluate the performance of Chat Generative Pre-trained Transformer (ChatGPT), a large language model trained by OpenAI, in the urologic field.
Materials and Methods: This study evaluated the effectiveness of ChatGPT in urology in three steps. The first step involved 35 questions from experts at our institution, each with at least 10 years of experience in their field. The responses of the ChatGPT versions were qualitatively compared with the responses of urology residents to the same questions. The second step assessed the reliability of the ChatGPT versions in answering questions on current debate topics. The third step assessed the reliability of the ChatGPT versions in providing medical recommendations and directives in response to questions commonly asked by patients in the outpatient and inpatient clinics.
Results: In the first step, version 4 answered 25 of the 35 questions correctly, whereas version 3.5 answered only 19 (71.4% vs 54.3%). Residents in their last year of education in our clinic also provided a mean of 25 correct answers, and 4th-year residents provided a mean of 19.3 correct answers. In the second step, which evaluated the responses of both versions to debated situations in urology, both versions provided variable and inappropriate results. In the last step, both versions had a similar success rate in providing recommendations and guidance to patients, based on expert ratings.
Conclusion: The difference between the two versions on the 35 questions in the first step was thought to be due to improvement in ChatGPT's literature and data synthesis abilities. It may be reasonable to use ChatGPT versions to give quick and safe answers to questions from non-health care providers, but they should not be used as a diagnostic tool or to choose among different treatment modalities.
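The percentages in the Results follow directly from the reported counts out of 35 questions. A minimal sketch of that arithmetic in Python, using only the figures quoted in the abstract; the function name `percent_correct` and the dictionary labels are illustrative and do not come from the study:

```python
TOTAL_QUESTIONS = 35  # number of expert questions in the first step


def percent_correct(correct: float, total: int = TOTAL_QUESTIONS) -> float:
    """Return the share of correct answers as a percentage, rounded to one decimal."""
    return round(100 * correct / total, 1)


# Counts quoted in the abstract; the labels are illustrative.
scores = {
    "ChatGPT version 4": 25,
    "ChatGPT version 3.5": 19,
    "Final-year residents (mean)": 25,
    "4th-year residents (mean)": 19.3,
}

for label, correct in scores.items():
    print(f"{label}: {percent_correct(correct)}%")
```

On these counts, version 4 scores 71.4% and version 3.5 scores 54.3%, which matches the comparison reported in the Results; the 4th-year resident mean of 19.3 corresponds to 55.1%.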
About the journal:
Journal of Endourology, JE Case Reports, and Videourology are the leading peer-reviewed journal, case reports publication, and innovative videojournal companion covering all aspects of minimally invasive urology research, applications, and clinical outcomes.
The leading journal of minimally invasive urology for over 30 years, Journal of Endourology is the essential publication for practicing surgeons who want to keep up with the latest surgical technologies in endoscopic, laparoscopic, robotic, and image-guided procedures as they apply to benign and malignant diseases of the genitourinary tract. This flagship journal includes the companion videojournal Videourology™ with every subscription. While Journal of Endourology remains focused on publishing rigorously peer-reviewed articles, Videourology accepts original videos containing material that has not been reported elsewhere, except in the form of an abstract or a conference presentation.
Journal of Endourology coverage includes:
The latest laparoscopic, robotic, endoscopic, and image-guided techniques for treating both benign and malignant conditions
Pioneering research articles
Controversial cases in endourology
Techniques in endourology with accompanying videos
Reviews and epochs in endourology
An endourology survey section covering endourology-relevant manuscripts published in other journals