Hu Gao, Jing Yang, Ying Zhang, Ning Wang, Jingfan Yang, Depeng Dang
{"title":"A novel single-stage network for accurate image restoration","authors":"Hu Gao, Jing Yang, Ying Zhang, Ning Wang, Jingfan Yang, Depeng Dang","doi":"10.1007/s00371-024-03599-6","DOIUrl":null,"url":null,"abstract":"<p>Image restoration is the task of aiming to obtain a high-quality image from a corrupt input image, such as deblurring and deraining. In image restoration, it is typically necessary to maintain a complex balance between spatial details and contextual information. Although a multi-stage network can optimally balance these competing goals and achieve significant performance, this also increases the system’s complexity. In this paper, we propose a mountain-shaped single-stage design, which achieves the performance of multi-stage networks through a plug-and-play feature fusion middleware. Specifically, we propose a plug-and-play feature fusion middleware mechanism as an information exchange component between the encoder-decoder architectural levels. It seamlessly integrates upper-layer information into the adjacent lower layer, sequentially down to the lowest layer. Finally, all information is fused into the original image resolution manipulation level. This preserves spatial details and integrates contextual information, ensuring high-quality image restoration. Simultaneously, we propose a multi-head attention middle block as a bridge between the encoder and decoder to capture more global information and surpass the limitations of the receptive field of CNNs. In order to achieve low system complexity, we removes or replaces unnecessary nonlinear activation functions. Extensive experiments demonstrate that our approach, named as M3SNet, outperforms previous state-of-the-art models while using less than half the computational costs, for several image restoration tasks, such as image deraining and deblurring. 
The code and the pre-trained models will be released at https://github.com/Tombs98/M3SNet.</p>","PeriodicalId":501186,"journal":{"name":"The Visual Computer","volume":"35 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"The Visual Computer","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s00371-024-03599-6","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Image restoration aims to recover a high-quality image from a corrupted input, as in deblurring and deraining. It typically requires maintaining a delicate balance between spatial details and contextual information. Although a multi-stage network can balance these competing goals and achieve strong performance, it also increases the system's complexity. In this paper, we propose a mountain-shaped single-stage design that matches the performance of multi-stage networks through a plug-and-play feature fusion middleware. Specifically, we propose a plug-and-play feature fusion middleware mechanism as an information exchange component between the levels of the encoder-decoder architecture. It seamlessly integrates upper-layer information into the adjacent lower layer, sequentially down to the lowest layer; finally, all information is fused at the original-resolution manipulation level. This preserves spatial details while integrating contextual information, ensuring high-quality image restoration. In addition, we propose a multi-head attention middle block as a bridge between the encoder and decoder to capture more global information and overcome the limited receptive field of CNNs. To keep system complexity low, we remove or replace unnecessary nonlinear activation functions. Extensive experiments demonstrate that our approach, named M3SNet, outperforms previous state-of-the-art models on several image restoration tasks, such as deraining and deblurring, while using less than half the computational cost. The code and pre-trained models will be released at https://github.com/Tombs98/M3SNet.
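The sequential top-down fusion described above can be illustrated with a minimal sketch. This is a hypothetical toy illustration, not the authors' implementation: feature maps are plain lists of floats, `fuse` stands in for a learned fusion operator, and the nearest-neighbor upsampling and averaging are assumptions made purely to show the control flow of merging each level into the next lower one until everything lands at the original-resolution level.

```python
# Hedged sketch of the middleware's sequential top-down fusion idea
# (hypothetical; real feature maps would be tensors and fuse() a
# learned module).

def fuse(upper, lower):
    """Integrate upper-level information into the adjacent lower level:
    upsample `upper` by repetition to match `lower`, then average."""
    factor = len(lower) // len(upper)
    upsampled = [v for v in upper for _ in range(factor)]
    return [(u + l) / 2 for u, l in zip(upsampled, lower)]

def middleware(levels):
    """Merge each level into the next lower one, sequentially, so all
    information ends up fused at the finest (original-resolution) level."""
    fused = levels[0]
    for lower in levels[1:]:
        fused = fuse(fused, lower)
    return fused

# Toy pyramid, coarsest (contextual) to finest (spatial detail).
pyramid = [
    [1.0, 2.0],              # coarsest level
    [1.0, 1.0, 2.0, 2.0],    # middle level
    [0.0] * 8,               # finest, original-resolution level
]
result = middleware(pyramid)
```

The fused output has the resolution of the finest level while carrying contextual information propagated down from every coarser level, which mirrors the abstract's claim of preserving spatial detail while integrating context.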