J.J. Szczesniewski , A. Ramos Alba , P.M. Rodríguez Castro , M.F. Lorenzo Gómez , J. Sainz González , L. Llanes González
{"title":"Quality of information about urologic pathology in English and Spanish from ChatGPT, BARD, and Copilot","authors":"J.J. Szczesniewski , A. Ramos Alba , P.M. Rodríguez Castro , M.F. Lorenzo Gómez , J. Sainz González , L. Llanes González","doi":"10.1016/j.acuroe.2024.02.009","DOIUrl":null,"url":null,"abstract":"<div><h3>Introduction and objective</h3><p>Generative artificial intelligence makes it possible to ask about medical pathologies in dialog boxes. Our objective was to analyze the quality of information about the most common urological pathologies provided by ChatGPT (OpenIA), BARD (Google), and Copilot (Microsoft).</p></div><div><h3>Methods</h3><p>We analyzed information on the following pathologies and their treatments as provided by AI: prostate cancer, kidney cancer, bladder cancer, urinary lithiasis, and benign prostatic hypertrophy (BPH). Questions in English and Spanish were posed in dialog boxes; the answers were collected and analyzed with DISCERN questionnaires and the overall appropriateness of the response. Surgical procedures were performed with an informed consent questionnaire.</p></div><div><h3>Results</h3><p>The responses from the three chatbots explained the pathology, detailed risk factors, and described treatments. The difference is that BARD and Copilot provide external information citations, which ChatGPT does not. The highest DISCERN scores, in absolute numbers, were obtained in Copilot; however, on the appropriacy scale it was noted that their responses were not the most appropriate. The best surgical treatment scores were obtained by BARD, followed by ChatGPT, and finally Copilot.</p></div><div><h3>Conclusions</h3><p>The answers obtained from generative AI on urological diseases depended on the formulation of the question. The information provided had significant biases, depending on pathology, language, and above all, the dialog box consulted.</p></div>","PeriodicalId":94291,"journal":{"name":"Actas urologicas espanolas","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Actas urologicas espanolas","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2173578624000167","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Introduction and objective
Generative artificial intelligence makes it possible to ask about medical pathologies in dialog boxes. Our objective was to analyze the quality of information about the most common urological pathologies provided by ChatGPT (OpenIA), BARD (Google), and Copilot (Microsoft).
Methods
We analyzed information on the following pathologies and their treatments as provided by AI: prostate cancer, kidney cancer, bladder cancer, urinary lithiasis, and benign prostatic hypertrophy (BPH). Questions in English and Spanish were posed in dialog boxes; the answers were collected and analyzed with DISCERN questionnaires and the overall appropriateness of the response. Surgical procedures were performed with an informed consent questionnaire.
Results
The responses from the three chatbots explained the pathology, detailed risk factors, and described treatments. The difference is that BARD and Copilot provide external information citations, which ChatGPT does not. The highest DISCERN scores, in absolute numbers, were obtained in Copilot; however, on the appropriacy scale it was noted that their responses were not the most appropriate. The best surgical treatment scores were obtained by BARD, followed by ChatGPT, and finally Copilot.
Conclusions
The answers obtained from generative AI on urological diseases depended on the formulation of the question. The information provided had significant biases, depending on pathology, language, and above all, the dialog box consulted.