通过OCT进行视网膜疾病诊断的多模态llm:少次学习与单次学习。

IF 2.3 Q2 OPHTHALMOLOGY
Therapeutic Advances in Ophthalmology Pub Date : 2025-05-20 eCollection Date: 2025-01-01 DOI:10.1177/25158414251340569
Reem Agbareia, Mahmud Omar, Ofira Zloto, Benjamin S Glicksberg, Girish N Nadkarni, Eyal Klang
{"title":"通过OCT进行视网膜疾病诊断的多模态llm:少次学习与单次学习。","authors":"Reem Agbareia, Mahmud Omar, Ofira Zloto, Benjamin S Glicksberg, Girish N Nadkarni, Eyal Klang","doi":"10.1177/25158414251340569","DOIUrl":null,"url":null,"abstract":"<p><strong>Background and aim: </strong>Multimodal large language models (LLMs) have shown potential in processing both text and image data for clinical applications. This study evaluated their diagnostic performance in identifying retinal diseases from optical coherence tomography (OCT) images.</p><p><strong>Methods: </strong>We assessed the diagnostic accuracy of GPT-4o and Claude Sonnet 3.5 using two public OCT datasets (OCTID, OCTDL) containing expert-labeled images of four pathological conditions and normal retinas. Both models were tested using single-shot and few-shot prompts, with an overall of 3088 models' API calls. Statistical analyses were performed to evaluate differences in overall and condition-specific performance.</p><p><strong>Results: </strong>GPT-4o's accuracy improved from 56.29% with single-shot prompts to 73.08% with few-shot prompts (<i>p</i> < 0.001). Similarly, Claude Sonnet 3.5 increased from 40.03% to 70.98% using the same approach (<i>p</i> < 0.001). Condition-specific analyses revealed similar trends, with absolute improvements ranging from 2% to 64%. These findings were consistent across the validation dataset.</p><p><strong>Conclusion: </strong>Few-shot prompted multimodal LLMs show promise for clinical integration, particularly in identifying normal retinas, which could help streamline referral processes in primary care. While these models fall short of the diagnostic accuracy reported in established deep learning literature, they offer simple, effective tools for assisting in routine retinal disease diagnosis. Future research should focus on further validation and integrating clinical text data with imaging.</p>","PeriodicalId":23054,"journal":{"name":"Therapeutic Advances in Ophthalmology","volume":"17 ","pages":"25158414251340569"},"PeriodicalIF":2.3000,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12093016/pdf/","citationCount":"0","resultStr":"{\"title\":\"Multimodal LLMs for retinal disease diagnosis via OCT: few-shot versus single-shot learning.\",\"authors\":\"Reem Agbareia, Mahmud Omar, Ofira Zloto, Benjamin S Glicksberg, Girish N Nadkarni, Eyal Klang\",\"doi\":\"10.1177/25158414251340569\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background and aim: </strong>Multimodal large language models (LLMs) have shown potential in processing both text and image data for clinical applications. This study evaluated their diagnostic performance in identifying retinal diseases from optical coherence tomography (OCT) images.</p><p><strong>Methods: </strong>We assessed the diagnostic accuracy of GPT-4o and Claude Sonnet 3.5 using two public OCT datasets (OCTID, OCTDL) containing expert-labeled images of four pathological conditions and normal retinas. Both models were tested using single-shot and few-shot prompts, with an overall of 3088 models' API calls. Statistical analyses were performed to evaluate differences in overall and condition-specific performance.</p><p><strong>Results: </strong>GPT-4o's accuracy improved from 56.29% with single-shot prompts to 73.08% with few-shot prompts (<i>p</i> < 0.001). Similarly, Claude Sonnet 3.5 increased from 40.03% to 70.98% using the same approach (<i>p</i> < 0.001). Condition-specific analyses revealed similar trends, with absolute improvements ranging from 2% to 64%. These findings were consistent across the validation dataset.</p><p><strong>Conclusion: </strong>Few-shot prompted multimodal LLMs show promise for clinical integration, particularly in identifying normal retinas, which could help streamline referral processes in primary care. While these models fall short of the diagnostic accuracy reported in established deep learning literature, they offer simple, effective tools for assisting in routine retinal disease diagnosis. Future research should focus on further validation and integrating clinical text data with imaging.</p>\",\"PeriodicalId\":23054,\"journal\":{\"name\":\"Therapeutic Advances in Ophthalmology\",\"volume\":\"17 \",\"pages\":\"25158414251340569\"},\"PeriodicalIF\":2.3000,\"publicationDate\":\"2025-05-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12093016/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Therapeutic Advances in Ophthalmology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1177/25158414251340569\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q2\",\"JCRName\":\"OPHTHALMOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Therapeutic Advances in Ophthalmology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1177/25158414251340569","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"OPHTHALMOLOGY","Score":null,"Total":0}
引用次数: 0

摘要

背景和目的:多模态大语言模型(LLMs)在临床应用中显示出处理文本和图像数据的潜力。本研究评估了它们在光学相干断层扫描(OCT)图像中识别视网膜疾病的诊断性能。方法:我们使用两个公共OCT数据集(OCTID, OCTDL)评估gpt - 40和Claude Sonnet 3.5的诊断准确性,这些数据集包含专家标记的四种病理状态和正常视网膜的图像。这两个模型都使用单次和几次提示进行测试,总共有3088个模型的API调用。进行统计分析以评估总体和特定条件下性能的差异。结果:gpt - 40的准确率从单次提示的56.29%提高到少次提示的73.08% (p)结论:少次提示的多模式LLMs在临床整合方面有希望,特别是在识别正常视网膜方面,可以帮助简化初级保健的转诊流程。虽然这些模型的诊断准确性低于已建立的深度学习文献,但它们为辅助常规视网膜疾病诊断提供了简单有效的工具。未来的研究应集中在进一步验证和整合临床文本数据与影像学。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

Multimodal LLMs for retinal disease diagnosis via OCT: few-shot versus single-shot learning.

Multimodal LLMs for retinal disease diagnosis via OCT: few-shot versus single-shot learning.

Multimodal LLMs for retinal disease diagnosis via OCT: few-shot versus single-shot learning.

Multimodal LLMs for retinal disease diagnosis via OCT: few-shot versus single-shot learning.

Background and aim: Multimodal large language models (LLMs) have shown potential in processing both text and image data for clinical applications. This study evaluated their diagnostic performance in identifying retinal diseases from optical coherence tomography (OCT) images.

Methods: We assessed the diagnostic accuracy of GPT-4o and Claude Sonnet 3.5 using two public OCT datasets (OCTID, OCTDL) containing expert-labeled images of four pathological conditions and normal retinas. Both models were tested using single-shot and few-shot prompts, with an overall of 3088 models' API calls. Statistical analyses were performed to evaluate differences in overall and condition-specific performance.

Results: GPT-4o's accuracy improved from 56.29% with single-shot prompts to 73.08% with few-shot prompts (p < 0.001). Similarly, Claude Sonnet 3.5 increased from 40.03% to 70.98% using the same approach (p < 0.001). Condition-specific analyses revealed similar trends, with absolute improvements ranging from 2% to 64%. These findings were consistent across the validation dataset.

Conclusion: Few-shot prompted multimodal LLMs show promise for clinical integration, particularly in identifying normal retinas, which could help streamline referral processes in primary care. While these models fall short of the diagnostic accuracy reported in established deep learning literature, they offer simple, effective tools for assisting in routine retinal disease diagnosis. Future research should focus on further validation and integrating clinical text data with imaging.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
4.50
自引率
0.00%
发文量
44
审稿时长
12 weeks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信