CUNSB-RFIE: Context-aware Unpaired Neural Schrödinger Bridge in Retinal Fundus Image Enhancement
Xuanzhao Dong, Vamsi Krishna Vasa, Wenhui Zhu, Peijie Qiu, Xiwen Chen, Yi Su, Yujian Xiong, Zhangsihao Yang, Yanxi Chen, Yalin Wang
arXiv - EE - Image and Video Processing, 2024-09-17 (arXiv:2409.10966)
Abstract
Retinal fundus photography plays an important role in diagnosing and monitoring
retinal diseases. However, systemic imperfections and operator/patient-related
factors can hinder the acquisition of high-quality retinal images. Previous
efforts in retinal image enhancement primarily relied on GANs, which are
limited by the trade-off between training stability and output diversity. In
contrast, the Schrödinger Bridge (SB) offers a more stable solution by
using Optimal Transport (OT) theory to model a stochastic differential
equation (SDE) between two arbitrary distributions. This allows SB to
effectively transform low-quality retinal images into their high-quality
counterparts. In this work, we leverage the SB framework to propose an
image-to-image translation pipeline for retinal image enhancement.
Additionally, previous methods often fail to capture fine structural details,
such as blood vessels. To address this, we enhance our pipeline by introducing
Dynamic Snake Convolution, whose tortuous receptive field can better preserve
tubular structures. We name the resulting retinal fundus image enhancement
framework the Context-aware Unpaired Neural Schrödinger Bridge
(CUNSB-RFIE). To the best of our knowledge, this is the first endeavor to use
the SB approach for retinal image enhancement. Experimental results on a
large-scale dataset demonstrate the advantage of the proposed method compared
to several state-of-the-art supervised and unsupervised methods in terms of
image quality and performance on downstream tasks. The code is available at
https://github.com/Retinal-Research/CUNSB-RFIE.
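
As a rough illustration of the SDE idea mentioned in the abstract, the sketch below samples intermediate states of a Brownian bridge pinned at a low-quality input x0 and a high-quality target x1. It assumes the textbook form x_t ~ Normal((1 - t) x0 + t x1, tau t (1 - t) I), which describes a Schrödinger Bridge with a Brownian reference process once both endpoints are fixed. It is not the CUNSB-RFIE implementation: the function name, the toy pixel values, and the noise scale tau are all hypothetical.

```python
import math
import random


def brownian_bridge_sample(x0, x1, t, tau, rng):
    """Sample x_t from a Brownian bridge pinned at x0 (t=0) and x1 (t=1).

    Assumed form (elementwise):
        x_t ~ Normal((1 - t) * x0 + t * x1, tau * t * (1 - t))
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    # Standard deviation is zero at both endpoints and largest at t = 0.5.
    std = math.sqrt(tau * t * (1.0 - t))
    return [(1.0 - t) * a + t * b + std * rng.gauss(0.0, 1.0)
            for a, b in zip(x0, x1)]


# Hypothetical toy "pixels": a dim patch and a brighter target patch.
low_quality = [0.10, 0.20, 0.15, 0.05]
high_quality = [0.60, 0.80, 0.70, 0.40]

rng = random.Random(0)  # fixed seed so repeated runs give the same samples
for t in (0.0, 0.5, 1.0):
    x_t = brownian_bridge_sample(low_quality, high_quality, t, tau=0.01, rng=rng)
    print(t, [round(v, 3) for v in x_t])
```

At t = 0 and t = 1 the noise term vanishes, so the sample coincides with the input and the target respectively. In between, tau controls how far the path wanders from the straight line connecting the two images.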