Assessing the Performance of Chat Generative Pretrained Transformer (ChatGPT) in Answering Andrology-Related Questions.

Impact factor: 1.1 · Category: Urology & Nephrology

Urology research & practice Pub Date : 2023-11-01 DOI:10.5152/tud.2023.23171

Ufuk Caglar, Oguzhan Yildiz, M Fırat Ozervarli, Resat Aydin, Omer Sarilar, Faruk Ozgor, Mazhar Ortac

Urology research & practice, pp. 365-369. Journal Article. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10765186/pdf/


Abstract

Objective: The internet and social media have become primary sources of health information, with men frequently turning to these platforms before seeking professional help. Chat generative pretrained transformer (ChatGPT), an artificial intelligence model developed by OpenAI, has gained popularity as a natural language processing program. The present study evaluated the accuracy and reproducibility of ChatGPT's responses to andrology-related questions.

Methods: The study analyzed frequently asked andrology questions from health forums, hospital websites, and social media platforms such as YouTube and Instagram. Questions were categorized by topic, such as male hypogonadism and erectile dysfunction. Questions based on the European Association of Urology (EAU) guideline recommendations were also included. All questions were input into ChatGPT, and the responses were evaluated by 3 experienced urologists, who scored them on a scale of 1 to 4.

Results: Out of 136 evaluated questions, 108 met the criteria. Of these, 87.9% received correct and adequate answers, 9.3% received answers that were correct but insufficient, and 3 responses contained both correct and incorrect information. No question was answered completely incorrectly. The highest correct answer rates were for disorders of ejaculation, penile curvature, and male hypogonadism. The EAU guideline-based questions achieved a correctness rate of 86.3%. The reproducibility of the answers was over 90%.
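The reported percentages can be checked against the 108 included questions with a little arithmetic. The sketch below is only a consistency check: the per-category counts it derives are inferred by rounding and are not stated in the abstract, and the function name `implied_count` is a hypothetical helper introduced for illustration.

```python
# Figures taken from the Results paragraph of the abstract.
TOTAL_INCLUDED = 108             # questions that met the inclusion criteria
PCT_CORRECT_ADEQUATE = 87.9      # reported as a percentage
PCT_CORRECT_INSUFFICIENT = 9.3   # reported as a percentage
MIXED_COUNT = 3                  # reported as a count, not a percentage


def implied_count(percent: float, total: int) -> int:
    """Nearest whole number of questions for a reported percentage.

    This is an inference from rounding, not a figure given by the authors.
    """
    return round(percent * total / 100)


adequate = implied_count(PCT_CORRECT_ADEQUATE, TOTAL_INCLUDED)
insufficient = implied_count(PCT_CORRECT_INSUFFICIENT, TOTAL_INCLUDED)
mixed_pct = 100 * MIXED_COUNT / TOTAL_INCLUDED

# Implied counts per category and whether they add up to the total.
print(adequate, insufficient, MIXED_COUNT)
print(adequate + insufficient + MIXED_COUNT == TOTAL_INCLUDED)
# Share of responses that mixed correct and incorrect information.
print(f"{mixed_pct:.1f}%")
```

Under these assumptions the three categories account for all 108 included questions, which is consistent with the statement that no question was answered completely incorrectly.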

Conclusion: The study found that ChatGPT provided accurate and reliable answers to over 80% of andrology-related questions. While limitations exist, such as potentially outdated data and an inability to understand emotional aspects, ChatGPT's potential in the health-care sector is promising. Collaborating with health-care professionals during artificial intelligence model development could enhance its reliability.




Source journal

Urology research & practice

CiteScore

2.60

Self-citation rate

0.00%

Number of articles published