基于大型语言模型的多模态系统，用于从智能手机图像中检测和分级眼表疾病。

IF 4.6 2区生物学 Q2 CELL BIOLOGY

Frontiers in Cell and Developmental Biology Pub Date : 2025-05-23 eCollection Date: 2025-01-01 DOI:10.3389/fcell.2025.1600202

Zhongwen Li, Zhouqian Wang, Liheng Xiu, Pengyao Zhang, Wenfang Wang, Yangyang Wang, Gang Chen, Weihua Yang, Wei Chen

{"title":"基于大型语言模型的多模态系统，用于从智能手机图像中检测和分级眼表疾病。","authors":"Zhongwen Li, Zhouqian Wang, Liheng Xiu, Pengyao Zhang, Wenfang Wang, Yangyang Wang, Gang Chen, Weihua Yang, Wei Chen","doi":"10.3389/fcell.2025.1600202","DOIUrl":null,"url":null,"abstract":"Background: The development of medical artificial intelligence (AI) models is primarily driven by the need to address healthcare resource scarcity, particularly in underserved regions. Proposing an affordable, accessible, interpretable, and automated AI system for non-clinical settings is crucial to expanding access to quality healthcare.Methods: This cross-sectional study developed the Multimodal Ocular Surface Assessment and Interpretation Copilot (MOSAIC) using three multimodal large language models: gpt-4-turbo, claude-3-opus, and gemini-1.5-pro-latest, for detecting three ocular surface diseases (OSDs) and grading keratitis and pterygium. A total of 375 smartphone-captured ocular surface images collected from 290 eyes were utilized to validate MOSAIC. The performance of MOSAIC was evaluated in both zero-shot and few-shot settings, with tasks including image quality control, OSD detection, analysis of the severity of keratitis, and pterygium grading. The interpretability of the system was also evaluated.Results: MOSAIC achieved 95.00% accuracy in image quality control, 86.96% in OSD detection, 88.33% in distinguishing mild from severe keratitis, and 66.67% in determining pterygium grades with five-shot settings. The performance significantly improved with the increasing learning shots (p < 0.01). The system attained high ROUGE-L F1 scores of 0.70-0.78, depicting its interpretable image comprehension capability.Conclusion: MOSAIC exhibited exceptional few-shot learning capabilities, achieving high accuracy in OSD management with minimal training examples. This system has significant potential for smartphone integration to enhance the accessibility and effectiveness of OSD detection and grading in resource-limited settings.","PeriodicalId":12448,"journal":{"name":"Frontiers in Cell and Developmental Biology","volume":"13 ","pages":"1600202"},"PeriodicalIF":4.6000,"publicationDate":"2025-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12141289/pdf/","citationCount":"0","resultStr":"{\"title\":\"Large language model-based multimodal system for detecting and grading ocular surface diseases from smartphone images.\",\"authors\":\"Zhongwen Li, Zhouqian Wang, Liheng Xiu, Pengyao Zhang, Wenfang Wang, Yangyang Wang, Gang Chen, Weihua Yang, Wei Chen\",\"doi\":\"10.3389/fcell.2025.1600202\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Background: The development of medical artificial intelligence (AI) models is primarily driven by the need to address healthcare resource scarcity, particularly in underserved regions. Proposing an affordable, accessible, interpretable, and automated AI system for non-clinical settings is crucial to expanding access to quality healthcare.Methods: This cross-sectional study developed the Multimodal Ocular Surface Assessment and Interpretation Copilot (MOSAIC) using three multimodal large language models: gpt-4-turbo, claude-3-opus, and gemini-1.5-pro-latest, for detecting three ocular surface diseases (OSDs) and grading keratitis and pterygium. A total of 375 smartphone-captured ocular surface images collected from 290 eyes were utilized to validate MOSAIC. The performance of MOSAIC was evaluated in both zero-shot and few-shot settings, with tasks including image quality control, OSD detection, analysis of the severity of keratitis, and pterygium grading. The interpretability of the system was also evaluated.Results: MOSAIC achieved 95.00% accuracy in image quality control, 86.96% in OSD detection, 88.33% in distinguishing mild from severe keratitis, and 66.67% in determining pterygium grades with five-shot settings. The performance significantly improved with the increasing learning shots (p < 0.01). The system attained high ROUGE-L F1 scores of 0.70-0.78, depicting its interpretable image comprehension capability.Conclusion: MOSAIC exhibited exceptional few-shot learning capabilities, achieving high accuracy in OSD management with minimal training examples. This system has significant potential for smartphone integration to enhance the accessibility and effectiveness of OSD detection and grading in resource-limited settings.\",\"PeriodicalId\":12448,\"journal\":{\"name\":\"Frontiers in Cell and Developmental Biology\",\"volume\":\"13 \",\"pages\":\"1600202\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2025-05-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12141289/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Frontiers in Cell and Developmental Biology\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.3389/fcell.2025.1600202\",\"RegionNum\":2,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q2\",\"JCRName\":\"CELL BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers in Cell and Developmental Biology","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.3389/fcell.2025.1600202","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"CELL BIOLOGY","Score":null,"Total":0}

引用次数: 0

摘要

背景：医疗人工智能（AI）模型的发展主要是由解决医疗资源短缺的需求驱动的，特别是在服务不足的地区。为非临床环境提出负担得起、可访问、可解释和自动化的人工智能系统对于扩大获得高质量医疗保健至关重要。方法：本横断面研究采用gpt-4-turbo、claude-3-opus和gemini-1.5-pro-latest三种多模态大语言模型开发了多模态眼表评估和解释副驾驶仪（MOSAIC），用于检测三种眼表疾病（OSDs）并对角膜炎和翼状胬气进行分级。从290只眼睛中采集了375张智能手机拍摄的眼表图像，用于验证MOSAIC。在零镜头和少镜头设置下评估MOSAIC的性能，任务包括图像质量控制，OSD检测，角膜炎严重程度分析和翼状胬肉分级。对系统的可解释性也进行了评价。结果：MOSAIC在图像质量控制方面的准确率为95.00%，在OSD检测方面的准确率为86.96%，在区分轻度和重度角膜炎方面的准确率为88.33%，在确定五次设置的翼状胬肉分级方面的准确率为66.67%。随着学习次数的增加，生产性能显著提高（p < 0.01）。该系统获得了0.70-0.78的ROUGE-L F1高分，表明其具有可解释的图像理解能力。结论：MOSAIC表现出优异的少镜头学习能力，以最少的训练样本实现了高准确率的OSD管理。该系统具有与智能手机集成的巨大潜力，可以在资源有限的情况下提高OSD检测和分级的可及性和有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Large language model-based multimodal system for detecting and grading ocular surface diseases from smartphone images.

Background: The development of medical artificial intelligence (AI) models is primarily driven by the need to address healthcare resource scarcity, particularly in underserved regions. Proposing an affordable, accessible, interpretable, and automated AI system for non-clinical settings is crucial to expanding access to quality healthcare.

Methods: This cross-sectional study developed the Multimodal Ocular Surface Assessment and Interpretation Copilot (MOSAIC) using three multimodal large language models: gpt-4-turbo, claude-3-opus, and gemini-1.5-pro-latest, for detecting three ocular surface diseases (OSDs) and grading keratitis and pterygium. A total of 375 smartphone-captured ocular surface images collected from 290 eyes were utilized to validate MOSAIC. The performance of MOSAIC was evaluated in both zero-shot and few-shot settings, with tasks including image quality control, OSD detection, analysis of the severity of keratitis, and pterygium grading. The interpretability of the system was also evaluated.

Results: MOSAIC achieved 95.00% accuracy in image quality control, 86.96% in OSD detection, 88.33% in distinguishing mild from severe keratitis, and 66.67% in determining pterygium grades with five-shot settings. The performance significantly improved with the increasing learning shots (p < 0.01). The system attained high ROUGE-L F1 scores of 0.70-0.78, depicting its interpretable image comprehension capability.

Conclusion: MOSAIC exhibited exceptional few-shot learning capabilities, achieving high accuracy in OSD management with minimal training examples. This system has significant potential for smartphone integration to enhance the accessibility and effectiveness of OSD detection and grading in resource-limited settings.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Frontiers in Cell and Developmental Biology Biochemistry, Genetics and Molecular Biology-Cell Biology

CiteScore

9.70

自引率

3.60%

发文量

2531

审稿时长

12 weeks

期刊介绍： Frontiers in Cell and Developmental Biology is a broad-scope, interdisciplinary open-access journal, focusing on the fundamental processes of life, led by Prof Amanda Fisher and supported by a geographically diverse, high-quality editorial board. The journal welcomes submissions on a wide spectrum of cell and developmental biology, covering intracellular and extracellular dynamics, with sections focusing on signaling, adhesion, migration, cell death and survival and membrane trafficking. Additionally, the journal offers sections dedicated to the cutting edge of fundamental and translational research in molecular medicine and stem cell biology. With a collaborative, rigorous and transparent peer-review, the journal produces the highest scientific quality in both fundamental and applied research, and advanced article level metrics measure the real-time impact and influence of each publication.