Addressing Commonly Asked Questions in Urogynecology: Accuracy and Limitations of ChatGPT.

IF 1.8 · CAS Medicine Tier 3 · JCR Q3 (Obstetrics & Gynecology)
Gregory Vurture, Nicole Jenkins, James Ross, Stephanie Sansone, Ellen Conner, Nina Jacobson, Scott Smilen, Jonathan Baum
{"title":"Addressing Commonly Asked Questions in Urogynecology: Accuracy and Limitations of ChatGPT.","authors":"Gregory Vurture, Nicole Jenkins, James Ross, Stephanie Sansone, Ellen Conner, Nina Jacobson, Scott Smilen, Jonathan Baum","doi":"10.1007/s00192-025-06184-0","DOIUrl":null,"url":null,"abstract":"<p><strong>Introduction and hypothesis: </strong>Existing literature suggests that large language models such as Chat Generative Pre-training Transformer (ChatGPT) might provide inaccurate and unreliable health care information. The literature regarding its performance in urogynecology is scarce. The aim of the present study is to assess ChatGPT's ability to accurately answer commonly asked urogynecology patient questions.</p><p><strong>Methods: </strong>An expert panel of five board certified urogynecologists and two fellows developed ten commonly asked patient questions in a urogynecology office. Questions were phrased using diction and verbiage that a patient may use when asking a question over the internet. ChatGPT responses were evaluated using the Brief DISCERN (BD) tool, a validated scoring system for online health care information. Scores ≥ 16 are consistent with good-quality content. Responses were graded based on their accuracy and consistency with expert opinion and published guidelines.</p><p><strong>Results: </strong>The average score across all ten questions was 18.9 ± 2.7. Nine out of ten (90%) questions had a response that was determined to be of good quality (BD ≥ 16). The lowest scoring topic was \"Pelvic Organ Prolapse\" (mean BD = 14.0 ± 2.0). The highest scoring topic was \"Interstitial Cystitis\" (mean BD = 22.0 ± 0). ChatGPT provided no references for its responses.</p><p><strong>Conclusions: </strong>ChatGPT provided high-quality responses to 90% of the questions based on an expert panel's review with the BD tool. Nonetheless, given the evolving nature of this technology, continued analysis is crucial before ChatGPT can be accepted as accurate and reliable.</p>","PeriodicalId":14355,"journal":{"name":"International Urogynecology Journal","volume":" ","pages":""},"PeriodicalIF":1.8000,"publicationDate":"2025-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Urogynecology Journal","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1007/s00192-025-06184-0","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"OBSTETRICS & GYNECOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

Introduction and hypothesis: Existing literature suggests that large language models such as the Chat Generative Pre-trained Transformer (ChatGPT) might provide inaccurate and unreliable health care information, and the literature on ChatGPT's performance in urogynecology is scarce. The aim of the present study is to assess ChatGPT's ability to answer commonly asked urogynecology patient questions accurately.

Methods: An expert panel of five board-certified urogynecologists and two fellows developed ten questions commonly asked by patients in a urogynecology office. Questions were phrased in the diction a patient might use when asking a question over the internet. ChatGPT's responses were evaluated using the Brief DISCERN (BD) tool, a validated scoring system for online health care information; scores ≥ 16 are consistent with good-quality content. Responses were graded on their accuracy and consistency with expert opinion and published guidelines.
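The panel's questions and scoring sheets are not reproduced in the abstract, so the following is only a minimal sketch of how the grading criterion might be represented in code. The 16-point cutoff is the one reported above; the `GradedResponse` class, its field names, and the placeholder question strings are assumptions for illustration.

```python
from dataclasses import dataclass

GOOD_QUALITY_CUTOFF = 16  # Brief DISCERN scores >= 16 are consistent with good quality


@dataclass
class GradedResponse:
    """One ChatGPT answer to a patient-style question, scored with Brief DISCERN."""
    topic: str       # e.g., "Pelvic Organ Prolapse"
    question: str    # patient-style phrasing used as the prompt (not published)
    bd_score: float  # mean Brief DISCERN score assigned by the expert panel

    @property
    def good_quality(self) -> bool:
        # The study's threshold: BD >= 16 indicates good-quality content.
        return self.bd_score >= GOOD_QUALITY_CUTOFF


# The two topic means reported in the Results below:
pop = GradedResponse("Pelvic Organ Prolapse", "...", 14.0)
ic = GradedResponse("Interstitial Cystitis", "...", 22.0)
assert not pop.good_quality and ic.good_quality
```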

Results: The average score across all ten questions was 18.9 ± 2.7. Nine of the ten questions (90%) received a response judged to be of good quality (BD ≥ 16). The lowest-scoring topic was "Pelvic Organ Prolapse" (mean BD = 14.0 ± 2.0); the highest-scoring topic was "Interstitial Cystitis" (mean BD = 22.0 ± 0). ChatGPT provided no references for its responses.
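For concreteness, the summary statistics above follow from per-question scores in the usual way. The ten values below are hypothetical, chosen only to be consistent with the reported mean of 18.9 ± 2.7 and the nine-of-ten good-quality rate; the study's per-question data are not given in the abstract.

```python
from statistics import mean, stdev

GOOD_QUALITY_CUTOFF = 16

# Hypothetical per-question mean BD scores (not the study's actual data).
scores = [14.0, 22.0, 22.0, 16.0, 17.0, 18.0, 19.0, 21.0, 21.0, 19.0]

avg = mean(scores)                                      # -> 18.9
sd = stdev(scores)                                      # sample SD -> ~2.7
n_good = sum(s >= GOOD_QUALITY_CUTOFF for s in scores)  # -> 9

print(f"mean BD = {avg:.1f} ± {sd:.1f}; {n_good}/{len(scores)} good quality")
```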

Conclusions: Based on an expert panel's review with the BD tool, ChatGPT provided good-quality responses to 90% of the questions. Nonetheless, given the evolving nature of this technology, continued analysis is crucial before ChatGPT can be accepted as accurate and reliable.

Source journal: International Urogynecology Journal
CiteScore: 3.80
Self-citation rate: 22.20%
Annual article volume: 406
Time to review: 3-6 weeks
Journal description: The International Urogynecology Journal is the official journal of the International Urogynecological Association (IUGA). The Journal has evolved in response to a perceived need among the clinicians, scientists, and researchers active in the field of urogynecology and pelvic floor disorders. Gynecologists, urologists, physiotherapists, nurses, and basic scientists require a regular means of communication within this field of pelvic floor dysfunction to express new ideas and research, and to review clinical practice in the diagnosis and treatment of women with disorders of the pelvic floor. The Journal has adopted the peer review process for all original contributions and maintains high standards with regard to the research published therein. The clinical approach to urogynecology and pelvic floor disorders is emphasized, with each issue containing clinically relevant material that is immediately applicable to clinical medicine. The publication covers all aspects of the field in an interdisciplinary fashion.