Qi Zhang;Yuwei Ding;Weiqi Zhang;Yian Zhu;Bob Zhang;Jerry Chun-Wei Lin
IEEE Transactions on Consumer Electronics, vol. 71, no. 2, pp. 5584-5594. Published 2025-04-29. DOI: 10.1109/TCE.2025.3565390. https://ieeexplore.ieee.org/document/10979956/
Implicit Multi-Scale Swin Transformer Network for Image Denoising
Image denoising is used in edge computing scenarios such as consumer electronics to improve image quality and user experience. Existing image denoising methods based on Convolutional Neural Networks (CNNs) and vision Transformers achieve good performance by empirically using the residual neural network (ResNet) as the basic component. However, ResNet lacks interpretability in network design and long-term memory of features, which may restrict the performance of denoising networks. In this paper, we propose an Implicit Multi-scale Swin Transformer Network (IMSNet) for image denoising, which introduces the implicit Euler scheme from the feature memory perspective. Specifically, densely connected implicit feature extraction blocks (IFEBs) are designed to learn the residual mapping between noisy and clean images. The IFEB reformulates the initial skip connection of ResNet based on the implicit Euler discretization, providing both network interpretability and long-term memory. In the IFEB, the multi-scale Swin Transformer block (MSB) is designed as the implicit layer to capture spatial details and non-local contextual information at different scales. Additionally, a cross-layer feature fusion block (CLFF) is proposed to further improve feature reuse. Extensive experiments demonstrate that IMSNet outperforms existing denoising networks in various image denoising tasks, and that the proposed lightweight model is flexible enough for practical use on resource-restricted platforms such as consumer electronic devices.
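The abstract's contrast between a ResNet skip connection and an implicit Euler discretization can be illustrated with a small sketch. A standard residual block computes x_next = x + f(x), which matches an explicit Euler step of size 1. An implicit Euler step instead requires x_next = x + f(x_next), so the output appears on both sides. The code below is only a generic illustration of that difference, not the IFEB from the paper. The toy `layer` function, its `weight` parameter, and the fixed-point solver are all assumptions introduced here; the abstract does not say how IMSNet realizes the implicit step.

```python
import math

def layer(x, weight=0.5):
    """Toy stand-in for a learned layer f (hypothetical; NOT the paper's MSB)."""
    return [weight * math.tanh(v) for v in x]

def explicit_step(x, f=layer):
    """ResNet-style skip connection: x_next = x + f(x) (explicit Euler, step 1)."""
    return [a + b for a, b in zip(x, f(x))]

def implicit_step(x, f=layer, iters=50, tol=1e-10):
    """Implicit Euler: solve x_next = x + f(x_next) by fixed-point iteration.

    Assumes f is a contraction (true for the toy layer, whose slope is
    at most 0.5), so the iteration converges.
    """
    x_next = list(x)  # initial guess: the input itself
    for _ in range(iters):
        candidate = [a + b for a, b in zip(x, f(x_next))]
        diff = max(abs(c - p) for c, p in zip(candidate, x_next))
        x_next = candidate
        if diff < tol:
            break
    return x_next

def residual(x, x_next, f=layer):
    """Largest violation of the implicit equation x_next = x + f(x_next)."""
    return max(abs(n - (a + b)) for n, a, b in zip(x_next, x, f(x_next)))

features = [0.2, -1.0, 3.0]
x_exp = explicit_step(features)
x_imp = implicit_step(features)
print("explicit:", x_exp)
print("implicit:", x_imp)
print("implicit residual:", residual(features, x_imp))
```

In this sketch the explicit step evaluates the layer once on the input, while the implicit step repeatedly feeds its own output back through the layer until the equation is satisfied. Plain fixed-point iteration is used here only because it is the simplest solver for a contractive toy layer.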
About the journal:
The main focus for the IEEE Transactions on Consumer Electronics is the engineering and research aspects of the theory, design, construction, manufacture or end use of mass market electronics, systems, software and services for consumers.