Controllable retinal image synthesis using conditional StyleGAN and latent space manipulation for improved diagnosis and grading of diabetic retinopathy
Somayeh Pakdelmoez, Saba Omidikia, Seyyed Ali Seyyedsalehi (Department of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran); Seyyede Zohreh Seyyedsalehi (Department of Biomedical Engineering, Faculty of Health, Tehran Medical Sciences, Islamic Azad University, Tehran, Iran)
{"title":"Controllable retinal image synthesis using conditional StyleGAN and latent space manipulation for improved diagnosis and grading of diabetic retinopathy","authors":"Somayeh PakdelmoezDepartment of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran, Saba OmidikiaDepartment of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran, Seyyed Ali SeyyedsalehiDepartment of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran, Seyyede Zohreh SeyyedsalehiDepartment of Biomedical Engineering, Faculty of Health, Tehran Medical Sciences, Islamic Azad University, Tehran, Iran","doi":"arxiv-2409.07422","DOIUrl":null,"url":null,"abstract":"Diabetic retinopathy (DR) is a consequence of diabetes mellitus characterized\nby vascular damage within the retinal tissue. Timely detection is paramount to\nmitigate the risk of vision loss. However, training robust grading models is\nhindered by a shortage of annotated data, particularly for severe cases. This\npaper proposes a framework for controllably generating high-fidelity and\ndiverse DR fundus images, thereby improving classifier performance in DR\ngrading and detection. We achieve comprehensive control over DR severity and\nvisual features (optic disc, vessel structure, lesion areas) within generated\nimages solely through a conditional StyleGAN, eliminating the need for feature\nmasks or auxiliary networks. Specifically, leveraging the SeFa algorithm to\nidentify meaningful semantics within the latent space, we manipulate the DR\nimages generated conditionally on grades, further enhancing the dataset\ndiversity. Additionally, we propose a novel, effective SeFa-based data\naugmentation strategy, helping the classifier focus on discriminative regions\nwhile ignoring redundant features. Using this approach, a ResNet50 model\ntrained for DR detection achieves 98.09% accuracy, 99.44% specificity, 99.45%\nprecision, and an F1-score of 98.09%. Moreover, incorporating synthetic images\ngenerated by conditional StyleGAN into ResNet50 training for DR grading yields\n83.33% accuracy, a quadratic kappa score of 87.64%, 95.67% specificity, and\n72.24% precision. Extensive experiments conducted on the APTOS 2019 dataset\ndemonstrate the exceptional realism of the generated images and the superior\nperformance of our classifier compared to recent studies.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":"58 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - EE - Image and Video Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07422","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Diabetic retinopathy (DR) is a consequence of diabetes mellitus characterized by vascular damage within the retinal tissue. Timely detection is paramount to mitigate the risk of vision loss. However, training robust grading models is hindered by a shortage of annotated data, particularly for severe cases. This paper proposes a framework for controllably generating high-fidelity and diverse DR fundus images, thereby improving classifier performance in DR grading and detection. We achieve comprehensive control over DR severity and visual features (optic disc, vessel structure, lesion areas) within generated images solely through a conditional StyleGAN, eliminating the need for feature masks or auxiliary networks. Specifically, leveraging the SeFa algorithm to identify meaningful semantics within the latent space, we manipulate the DR images generated conditionally on grades, further enhancing dataset diversity. Additionally, we propose a novel, effective SeFa-based data augmentation strategy, helping the classifier focus on discriminative regions while ignoring redundant features. Using this approach, a ResNet50 model trained for DR detection achieves 98.09% accuracy, 99.44% specificity, 99.45% precision, and an F1-score of 98.09%. Moreover, incorporating synthetic images generated by the conditional StyleGAN into ResNet50 training for DR grading yields 83.33% accuracy, a quadratic kappa score of 87.64%, 95.67% specificity, and 72.24% precision. Extensive experiments conducted on the APTOS 2019 dataset demonstrate the exceptional realism of the generated images and the superior performance of our classifier compared to recent studies.
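As a rough illustration of the latent-space manipulation step, the sketch below shows how SeFa-style semantic directions can be obtained from a generator's first projection weight and used to shift a grade-conditioned latent code. The generator handle G, its style_affine attribute, the latent z, and the edit strength alpha are hypothetical placeholders, not the paper's actual implementation.

```python
import torch

def sefa_directions(weight: torch.Tensor, num_directions: int = 5) -> torch.Tensor:
    """Top eigenvectors of A^T A, used as candidate semantic edit directions."""
    # SeFa's closed-form result: the directions n that maximize ||A n|| are the
    # eigenvectors of A^T A associated with the largest eigenvalues.
    a = weight / weight.norm(dim=0, keepdim=True)   # column-normalize A (common stabilization)
    eigvals, eigvecs = torch.linalg.eigh(a.T @ a)   # eigh returns eigenvalues in ascending order
    order = torch.argsort(eigvals, descending=True)
    return eigvecs[:, order[:num_directions]].T     # shape: (num_directions, latent_dim)

# Hypothetical usage with a trained conditional StyleGAN generator G:
# directions = sefa_directions(G.style_affine.weight.detach())
# z_edited = z + 3.0 * directions[0]                # alpha = 3.0 sets the edit strength
# image = G(z_edited, grade_label)                  # image keeps the conditioned DR grade
```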
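The grading experiments add the StyleGAN-generated images to the real training set before fine-tuning ResNet50. A minimal sketch of that setup under assumed names is given below; DRDataset, the dataset paths, and the hyperparameters are hypothetical, and the training loop itself is omitted.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import models

# real_ds and synthetic_ds are assumed to yield (image_tensor, grade) pairs.
# real_ds = DRDataset("aptos2019/train")            # hypothetical dataset class and path
# synthetic_ds = DRDataset("stylegan_synthetic")    # grade-conditioned generated images
# train_loader = DataLoader(ConcatDataset([real_ds, synthetic_ds]),
#                           batch_size=32, shuffle=True)

# ImageNet-pretrained ResNet50 with a 5-way head for DR grades 0-4.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = torch.nn.Linear(model.fc.in_features, 5)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# ...cross-entropy fine-tuning loop over train_loader goes here...
```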
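For reference, the reported grading metrics (accuracy, quadratic weighted kappa, precision, specificity) can be computed as in the sketch below; the example label arrays and the macro averaging choice are placeholders, not necessarily the paper's exact protocol.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             confusion_matrix, precision_score)

def grading_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    acc = accuracy_score(y_true, y_pred)
    qwk = cohen_kappa_score(y_true, y_pred, weights="quadratic")
    prec = precision_score(y_true, y_pred, average="macro", zero_division=0)
    # Macro-averaged specificity from the multi-class confusion matrix:
    # per class, TN / (TN + FP) treating that class as the positive one.
    cm = confusion_matrix(y_true, y_pred)
    specs = []
    for k in range(cm.shape[0]):
        tp = cm[k, k]
        fp = cm[:, k].sum() - tp
        fn = cm[k, :].sum() - tp
        tn = cm.sum() - tp - fp - fn
        specs.append(tn / (tn + fp))
    return {"accuracy": acc, "quadratic_kappa": qwk,
            "precision": prec, "specificity": float(np.mean(specs))}

# Example with placeholder DR grades 0-4:
# y_true = np.array([0, 1, 2, 3, 4, 2, 1])
# y_pred = np.array([0, 1, 2, 2, 4, 2, 0])
# print(grading_metrics(y_true, y_pred))
```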