Diagnostic performance of advanced large language models in cystoscopy: evidence from a retrospective study and clinical cases.

Impact factor 1.7; Zone 3 (Medicine); JCR Q3 (Urology & Nephrology)
Linfa Guo, Yingtong Zuo, Zuhaer Yisha, Jiuling Liu, Aodun Gu, Refate Yushan, Guiyong Liu, Sheng Li, Tongzu Liu, Xiaolong Wang
BMC Urology 25(1):64. Published 2025-03-29. DOI: 10.1186/s12894-025-01740-8 (https://doi.org/10.1186/s12894-025-01740-8). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11954320/pdf/
Citations: 0

Abstract

Purpose: To evaluate the diagnostic capabilities of advanced large language models (LLMs) in interpreting cystoscopy images for the identification of common urological conditions.

Materials and methods: A retrospective analysis was conducted on 603 cystoscopy images obtained from 101 procedures. Two LLMs, ChatGPT-4 V and Claude 3.5 Sonnet, were used to interpret these images, and their diagnostic interpretations were systematically compared against standard clinical diagnostic assessments. The primary outcome was the overall diagnostic accuracy of the LLMs. Secondary outcomes were the condition-specific accuracies across various urological conditions.
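The two outcome measures described above can be illustrated with a minimal Python sketch. It assumes each image is stored as a pair of (reference diagnosis, model diagnosis) and that accuracy means the share of exact label matches; the function name `accuracy_report` and the toy records are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def accuracy_report(records):
    """Return (overall_accuracy, accuracy_by_condition).

    `records` is an iterable of (reference_label, model_label) pairs,
    one per image. Grouping is by the reference (clinical) label.
    """
    total = 0
    correct = 0
    per_condition = defaultdict(lambda: [0, 0])  # label -> [correct, total]
    for reference, predicted in records:
        hit = int(predicted == reference)
        total += 1
        correct += hit
        per_condition[reference][0] += hit
        per_condition[reference][1] += 1
    overall = correct / total if total else float("nan")
    by_condition = {label: k / n for label, (k, n) in per_condition.items()}
    return overall, by_condition


# Made-up toy records for illustration only, NOT study data.
toy_records = [
    ("cystitis", "cystitis"),
    ("cystitis", "cystitis"),
    ("BPH", "normal"),
    ("BPH", "BPH"),
]
overall, by_condition = accuracy_report(toy_records)
print(overall, by_condition)
```

Grouping by the reference label matches the idea of a condition-specific accuracy: for each clinically assigned condition, it reports how often the model named that same condition.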

Results: The combined diagnostic accuracy of both LLMs was 89.2%; individually, ChatGPT-4 V and Claude 3.5 Sonnet achieved accuracies of 82.8% and 79.8%, respectively. Condition-specific accuracies varied considerably. For urological disorders (ChatGPT-4 V, Claude 3.5 Sonnet): bladder tumors (92.2%, 80.9%), benign prostatic hyperplasia (BPH) (35.3%, 32.4%), cystitis (94.5%, 98.9%), bladder diverticula (92.3%, 53.8%), and bladder trabeculae (55.8%, 59.6%). For normal anatomical structures: ureteral orifice (48.8%, 61.0%), bladder neck (97.9%, 93.8%), and prostatic urethra (64.3%, 57.1%).
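The abstract does not define how the combined figure was computed. One plausible reading, assumed here purely for illustration, is that an image counts as correct when at least one of the two models is correct. Under that assumption, the combined accuracy must lie between the better single-model accuracy and the sum of the two (capped at 100%). The Python sketch below checks that the reported figures are consistent with this reading; the function names and the toy correctness flags are hypothetical.

```python
def either_correct_accuracy(correct_a, correct_b):
    """Share of images where at least one model is correct.

    `correct_a` and `correct_b` are equal-length sequences of booleans,
    one entry per image.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both models must be scored on the same images")
    hits = sum(a or b for a, b in zip(correct_a, correct_b))
    return hits / len(correct_a)


def union_bounds(acc_a, acc_b):
    """Lower and upper bounds on the either-correct accuracy."""
    return max(acc_a, acc_b), min(1.0, acc_a + acc_b)


# Made-up per-image correctness flags, NOT study data.
toy_a = [True, True, False, False]
toy_b = [True, False, True, False]
toy_combined = either_correct_accuracy(toy_a, toy_b)

# Figures reported in the abstract: 82.8%, 79.8%, combined 89.2%.
low, high = union_bounds(0.828, 0.798)
consistent = low <= 0.892 <= high
print(toy_combined, low, high, consistent)
```

The consistency check only shows that 89.2% is possible under the either-correct reading; it does not confirm that the authors used this definition.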

Conclusions: The LLMs demonstrated varying levels of diagnostic accuracy in cystoscopy image interpretation, excelling in cystitis detection while showing lower accuracy for other conditions, notably benign prostatic hyperplasia. These findings suggest potential for LLMs as supportive tools in urological diagnosis, particularly for urologists in training or early career stages. This study underscores the need for continued research and development to optimize these AI-driven tools, with the ultimate goal of improving diagnostic accuracy and efficiency in urological practice.

Clinical trial number: Not applicable.

Source journal: BMC Urology (Urology & Nephrology)
CiteScore: 3.20
Self-citation rate: 0.00%
Articles published: 177
Review time: >12 weeks
About the journal: BMC Urology is an open access journal publishing original peer-reviewed research articles in all aspects of the prevention, diagnosis and management of urological disorders, as well as related molecular genetics, pathophysiology, and epidemiology. The journal considers manuscripts in the following broad subject-specific sections of urology: endourology and technology; epidemiology and health outcomes; pediatric urology; pre-clinical and basic research; reconstructive urology; sexual function and fertility; urological imaging; urological oncology; voiding dysfunction; case reports.