Lightweight image super-resolution with sliding Proxy Attention Network

Impact factor 3.4 · Zone 2, Engineering & Technology · JCR Q2, ENGINEERING, ELECTRICAL & ELECTRONIC
Zhenyu Hu, Wanjie Sun, Zhenzhong Chen
Journal: Signal Processing, vol. 227, Article 109704
Published: 2024-09-11
DOI: 10.1016/j.sigpro.2024.109704
URL: https://www.sciencedirect.com/science/article/pii/S0165168424003244
Open access: no
Citations: 0

Abstract

Recently, image super-resolution (SR) models using window-based Transformers have outperformed SR models based on convolutional neural networks. Nevertheless, Transformer-based SR models are often computationally expensive, because they follow each window self-attention layer with a shifted-window self-attention layer to model long-range relationships, which adds computational overhead. Moreover, extracting local image features with the self-attention mechanism alone is insufficient to reconstruct rich high-frequency image content. To overcome these challenges, we propose the Sliding Proxy Attention Network (SPAN), which recovers high-quality High-Resolution (HR) images from Low-Resolution (LR) inputs with substantially fewer model parameters and computational operations. The primary innovation of SPAN is the Sliding Proxy Transformer Block (SPTB), which integrates the local detail sensitivity of convolution with the long-range dependency modeling of the self-attention mechanism. The key components of SPTB are the Enhanced Local Feature Extraction Block (ELFEB) and the Sliding Proxy Attention Block (SPAB). ELFEB enlarges the local receptive field with few parameters to compensate for high-frequency details. SPAB improves computational efficiency by leveraging window overlap to perform intra-window and cross-window attention in a single operation. Experimental results demonstrate that SPAN can produce high-quality SR images while keeping computational complexity under control. The code is publicly available at: https://github.com/zononhzy/SPAN.
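
The abstract does not spell out how SPAB uses window overlap, so the following is only a generic sketch of the idea, not the authors' implementation. It assumes that queries come from non-overlapping windows while keys come from the same windows enlarged by a few tokens per side, so that one attention pass also reaches tokens from neighboring windows. All names (`query_windows`, `key_window`, `attention_pairs`) and the `pad` parameter are hypothetical.

```python
from typing import List, Tuple

# A window is (row_start, row_end, col_start, col_end), with exclusive ends.
Window = Tuple[int, int, int, int]


def query_windows(height: int, width: int, size: int) -> List[Window]:
    """Non-overlapping windows that tile the feature map (the query side)."""
    return [
        (r, min(r + size, height), c, min(c + size, width))
        for r in range(0, height, size)
        for c in range(0, width, size)
    ]


def key_window(win: Window, height: int, width: int, pad: int) -> Window:
    """Enlarge a query window by `pad` tokens per side, clipped to the map.

    This enlargement is an assumption made for illustration only.
    """
    r0, r1, c0, c1 = win
    return (max(r0 - pad, 0), min(r1 + pad, height),
            max(c0 - pad, 0), min(c1 + pad, width))


def area(win: Window) -> int:
    """Number of tokens inside a window."""
    r0, r1, c0, c1 = win
    return (r1 - r0) * (c1 - c0)


def attention_pairs(height: int, width: int, size: int, pad: int) -> int:
    """Count the query-key pairs scored in one windowed attention pass."""
    return sum(
        area(q) * area(key_window(q, height, width, pad))
        for q in query_windows(height, width, size)
    )


if __name__ == "__main__":
    H, W, M, P = 8, 8, 4, 2  # toy feature map, window size, overlap
    for q in query_windows(H, W, M):
        # With pad > 0 each key window extends into neighboring windows.
        print("query", q, "-> keys", key_window(q, H, W, P))
    print("pairs without overlap:", attention_pairs(H, W, M, 0))
    print("pairs with overlap:   ", attention_pairs(H, W, M, P))
```

With `pad=0` the key window equals the query window, which corresponds to plain window self-attention. With `pad>0` every query also attends to tokens just outside its own window, so cross-window information flows without a separate shifted-window layer. The pair counts are a rough proxy for attention cost in this toy setting and say nothing about the paper's measured complexity.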

Source journal: Signal Processing (Engineering & Technology: Electrical & Electronic Engineering)
CiteScore: 9.20
Self-citation rate: 9.10%
Articles published: 309
Review time: 41 days
About the journal: Signal Processing incorporates all aspects of the theory and practice of signal processing. It features original research work, tutorial and review articles, and accounts of practical developments. It is intended for a rapid dissemination of knowledge and experience to engineers and scientists working in the research, development or practical application of signal processing. Subject areas covered by the journal include: Signal Theory; Stochastic Processes; Detection and Estimation; Spectral Analysis; Filtering; Signal Processing Systems; Software Developments; Image Processing; Pattern Recognition; Optical Signal Processing; Digital Signal Processing; Multi-dimensional Signal Processing; Communication Signal Processing; Biomedical Signal Processing; Geophysical and Astrophysical Signal Processing; Earth Resources Signal Processing; Acoustic and Vibration Signal Processing; Data Processing; Remote Sensing; Signal Processing Technology; Radar Signal Processing; Sonar Signal Processing; Industrial Applications; New Applications.