GlaucoDiff: A Framework for Generating Balanced Glaucoma Fundus Images and Improving Diagnostic Performance

IF 2.5 · CAS Tier 4 (Computer Science) · JCR Q2 (Engineering, Electrical & Electronic)
Caisheng Liao, Yuki Todo, Jiashu Zhang, Zheng Tang
{"title":"GlaucoDiff: A Framework for Generating Balanced Glaucoma Fundus Images and Improving Diagnostic Performance","authors":"Caisheng Liao,&nbsp;Yuki Todo,&nbsp;Jiashu Zhang,&nbsp;Zheng Tang","doi":"10.1002/ima.70185","DOIUrl":null,"url":null,"abstract":"<div>\n \n <p>Glaucoma is a leading cause of irreversible blindness, and early diagnosis is critical. While retinal fundus images are commonly used for screening, AI-based diagnostic models face challenges such as data scarcity, class imbalance, and limited image diversity. To address this, we introduce GlaucoDiff, a diffusion-based image synthesis framework designed to generate clinically meaningful glaucoma fundus images. It employs a two-stage training strategy and integrates a multimodal large language model as an automated quality filter to ensure clinical relevance. Experiments on the JustRAIGS dataset show that GlaucoDiff outperforms commercial generators such as DALL-E 3 and Keling, achieving better image quality and diversity (FID: 109.8; SWD: 222.2). When synthetic images were used to augment the training set of a vision transformer classifier, sensitivity improved consistently from 0.8182 with only real data to 0.8615 with 10% synthetic images, and further to 0.8788 with 50%. However, as the proportion of synthetic data increased, other important metrics such as specificity, accuracy, and AUC began to decline compared to the results with 10% synthetic data. This finding suggests that although more synthetic images can enhance the model's ability to detect positive cases, too much synthetic data may reduce overall classification performance. These results demonstrate the practical value of GlaucoDiff in alleviating data imbalance and improving diagnostic accuracy for AI-assisted glaucoma screening.</p>\n </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"35 5","pages":""},"PeriodicalIF":2.5000,"publicationDate":"2025-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Imaging Systems and Technology","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/ima.70185","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

Abstract

Glaucoma is a leading cause of irreversible blindness, and early diagnosis is critical. While retinal fundus images are commonly used for screening, AI-based diagnostic models face challenges such as data scarcity, class imbalance, and limited image diversity. To address this, we introduce GlaucoDiff, a diffusion-based image synthesis framework designed to generate clinically meaningful glaucoma fundus images. It employs a two-stage training strategy and integrates a multimodal large language model as an automated quality filter to ensure clinical relevance. Experiments on the JustRAIGS dataset show that GlaucoDiff outperforms commercial generators such as DALL-E 3 and Keling, achieving better image quality and diversity (FID: 109.8; SWD: 222.2). When synthetic images were used to augment the training set of a vision transformer classifier, sensitivity improved consistently from 0.8182 with only real data to 0.8615 with 10% synthetic images, and further to 0.8788 with 50%. However, as the proportion of synthetic data increased, other important metrics such as specificity, accuracy, and AUC began to decline compared to the results with 10% synthetic data. This finding suggests that although more synthetic images can enhance the model's ability to detect positive cases, too much synthetic data may reduce overall classification performance. These results demonstrate the practical value of GlaucoDiff in alleviating data imbalance and improving diagnostic accuracy for AI-assisted glaucoma screening.
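The abstract describes the augmentation protocol only at a high level, so the sketch below illustrates one plausible way to reproduce the evaluation loop it implies: mix a chosen fraction of synthetic glaucoma images into the real training set, then score the resulting classifier with sensitivity, specificity, accuracy, and AUC. This is a minimal illustration, not the authors' released code; the function names and the interpretation of the synthetic-image percentage as a fraction of the real training set are assumptions.

```python
# Minimal sketch of the evaluation protocol described in the abstract.
# NOT the authors' code: all names here are hypothetical, and the exact
# meaning of "10% / 50% synthetic images" (relative to the real set vs. the
# combined set) is an assumption made for illustration.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score


def mix_synthetic(real_x, real_y, synth_x, synth_y, synth_fraction, seed=0):
    """Augment the real training set with synthetic images amounting to
    `synth_fraction` (e.g. 0.1 or 0.5) of the real set's size."""
    rng = np.random.default_rng(seed)
    n_synth = int(round(len(real_x) * synth_fraction))
    idx = rng.choice(len(synth_x), size=min(n_synth, len(synth_x)), replace=False)
    x = np.concatenate([real_x, synth_x[idx]], axis=0)
    y = np.concatenate([real_y, synth_y[idx]], axis=0)
    return x, y


def screening_metrics(y_true, y_score, threshold=0.5):
    """Sensitivity, specificity, accuracy, and AUC for a binary screener
    (label 1 = glaucoma), given predicted probabilities `y_score`."""
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "auc": roc_auc_score(y_true, y_score),
    }


# Toy usage with placeholder arrays standing in for fundus images and a
# trained vision transformer's predicted probabilities.
if __name__ == "__main__":
    rng = np.random.default_rng(42)
    real_x, real_y = rng.normal(size=(200, 8)), rng.integers(0, 2, 200)
    synth_x, synth_y = rng.normal(size=(100, 8)), np.ones(100, dtype=int)
    train_x, train_y = mix_synthetic(real_x, real_y, synth_x, synth_y, synth_fraction=0.1)
    y_score = rng.uniform(size=50)          # replace with model outputs
    y_true = rng.integers(0, 2, 50)         # replace with held-out labels
    print(screening_metrics(y_true, y_score))
```

In the paper's setting, sensitivity rises as the synthetic fraction grows (0.8182 with real data only, 0.8615 at 10%, 0.8788 at 50%), while specificity, accuracy, and AUC peak near the 10% mix, which is why sweeping `synth_fraction` over several values rather than fixing a single ratio is the natural way to report this trade-off.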

Source Journal
International Journal of Imaging Systems and Technology (Engineering & Technology: Imaging Science & Photographic Technology)
CiteScore: 6.90
Self-citation rate: 6.10%
Annual articles: 138
Review time: 3 months
Journal Description: The International Journal of Imaging Systems and Technology (IMA) is a forum for the exchange of ideas and results relevant to imaging systems, including imaging physics and informatics. The journal covers all imaging modalities in humans and animals. IMA accepts technically sound and scientifically rigorous research in the interdisciplinary field of imaging, including relevant algorithmic research and hardware and software development, and their applications relevant to medical research. The journal provides a platform to publish original research in structural and functional imaging. The journal is also open to imaging studies of the human body and of animals that describe novel diagnostic imaging and analysis methods. Technical, theoretical, and clinical research in both normal and clinical populations is encouraged. Submissions describing methods, software, databases, replication studies, as well as negative results are also considered. The scope of the journal includes, but is not limited to, the following in the context of biomedical research:
- Imaging and neuro-imaging modalities: structural MRI, functional MRI, PET, SPECT, CT, ultrasound, EEG, MEG, NIRS, etc.
- Neuromodulation and brain stimulation techniques such as TMS and tDCS
- Software and hardware for imaging, especially related to human and animal health
- Image segmentation in normal and clinical populations
- Pattern analysis and classification using machine learning techniques
- Computational modeling and analysis
- Brain connectivity and connectomics
- Systems-level characterization of brain function
- Neural networks and neurorobotics
- Computer vision, based on human/animal physiology
- Brain-computer interface (BCI) technology
- Big data, databasing and data mining