WFDENet: Wavelet-based frequency decomposition and enhancement network for diabetic retinopathy lesion segmentation
Authors: Xuan Li, Ding Ma, Xiangqian Wu
Journal: Pattern Recognition, vol. 172, Article 112492 (impact factor 7.6; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2025-10-06
DOI: 10.1016/j.patcog.2025.112492
URL: https://www.sciencedirect.com/science/article/pii/S0031320325011550
Citations: 0
Abstract
Precise semantic and detailed information is indispensable for high-accuracy diabetic retinopathy lesion segmentation (DRLS). Noting that high- and low-level encoder features contain rich semantics and details, respectively, most existing DRLS methods focus on designing elaborate multi-level feature refinement and fusion schemes. However, they overlook the intrinsic low- and high-frequency information of multi-level features, which can also describe semantics and details. To fill this gap, we propose a Wavelet-based Frequency Decomposition and Enhancement Network (WFDENet), which simultaneously refines semantic and detailed representations by enhancing the low- and high-frequency components of the multi-level encoder features. Specifically, the low- and high-frequency components, which are obtained via the discrete wavelet transform (DWT), are boosted by a low-frequency booster (LFB) and a high-frequency booster (HFB), respectively. High-frequency components contain abundant details but also more noise. To suppress the noise and strengthen critical features, in the HFB we devise a complex convolutional frequency attention module (CCFAM), which uses complex convolutions to generate dynamic complex-valued channel and spatial attention that refines the Fourier spectrum of the high-frequency components. Moreover, considering the importance of multi-scale information, we aggregate multi-scale frequency features to enrich the frequency components in both the LFB and the HFB. Experimental results on the IDRiD and DDR datasets show that WFDENet outperforms state-of-the-art methods. The source code is available at https://github.com/xuanli01/WFDENet.
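To make the decomposition step concrete, the following is a minimal sketch of a single-level 2D DWT that splits one feature map into a low-frequency sub-band and three high-frequency sub-bands, and then reconstructs the input. It assumes the Haar wavelet and plain Python lists; the abstract does not state which wavelet WFDENet uses, and the names `haar_dwt2` and `haar_idwt2` are hypothetical, not taken from the authors' repository. The LFB, HFB, and CCFAM modules are not modelled here.

```python
def haar_dwt2(x):
    """Single-level 2D Haar DWT of a 2D list with even height and width.

    Returns (ll, lh, hl, hh), each at half the input resolution.
    Sub-band naming (lh vs. hl) varies between libraries; here it is an
    assumption made for illustration.
    """
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, h, 2):
        r_ll, r_lh, r_hl, r_hh = [], [], [], []
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]          # top row of the 2x2 block
            c, d = x[i + 1][j], x[i + 1][j + 1]  # bottom row of the block
            r_ll.append((a + b + c + d) / 2)  # scaled local sum: low frequency
            r_lh.append((a + b - c - d) / 2)  # top minus bottom row
            r_hl.append((a - b + c - d) / 2)  # left minus right column
            r_hh.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(r_ll)
        lh.append(r_lh)
        hl.append(r_hl)
        hh.append(r_hh)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution 2D list."""
    out = []
    for i in range(len(ll)):
        top, bottom = [], []
        for j in range(len(ll[0])):
            s, v, u, g = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            top += [(s + v + u + g) / 2, (s + v - u - g) / 2]
            bottom += [(s - v + u - g) / 2, (s - v - u + g) / 2]
        out.append(top)
        out.append(bottom)
    return out


# Toy 4x4 "feature map": three flat 2x2 blocks and one block with variation.
feature = [
    [1, 1, 5, 5],
    [1, 1, 5, 5],
    [2, 4, 0, 0],
    [6, 8, 0, 0],
]
ll, lh, hl, hh = haar_dwt2(feature)
restored = haar_idwt2(ll, lh, hl, hh)
print("low-frequency sub-band:", ll)
print("reconstruction matches input:", restored == feature)
```

In this sketch the flat blocks produce zeros in all three high-frequency sub-bands, while the block with variation produces non-zero detail coefficients. That is the property the abstract relies on when it associates low-frequency components with semantics and high-frequency components with details. Because the transform is invertible, enhanced sub-bands can in principle be mapped back to the original resolution.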
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.