用于图像超分辨率的空间松弛变换器

IF 5.2 2区计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of King Saud University-Computer and Information Sciences Pub Date : 2024-08-06 DOI:10.1016/j.jksuci.2024.102150

Yinghua Li , Ying Zhang , Hao Zeng , Jinglu He , Jie Guo

{"title":"用于图像超分辨率的空间松弛变换器","authors":"Yinghua Li , Ying Zhang , Hao Zeng , Jinglu He , Jie Guo","doi":"10.1016/j.jksuci.2024.102150","DOIUrl":null,"url":null,"abstract":"<div><p>Transformer-based approaches have demonstrated remarkable performance in image processing tasks due to their ability to model long-range dependencies. Current mainstream Transformer-based methods typically confine self-attention computation within windows to reduce computational burden. However, this constraint may lead to grid artifacts in the reconstructed images due to insufficient cross-window information exchange, particularly in image super-resolution tasks. To address this issue, we propose the Multi-Scale Texture Complementation Block based on Spatial Relaxation Transformer (MSRT), which leverages features at multiple scales and augments information exchange through cross windows attention computation. In addition, we introduce a loss function based on the prior of texture smoothness transformation, which utilizes the continuity of textures between patches to constrain the generation of more coherent texture information in the reconstructed images. Specifically, we employ learnable compressive sensing technology to extract shallow features from images, preserving image features while reducing feature dimensions and improving computational efficiency. Extensive experiments conducted on multiple benchmark datasets demonstrate that our method outperforms previous state-of-the-art approaches in both qualitative and quantitative evaluations.</p></div>","PeriodicalId":48547,"journal":{"name":"Journal of King Saud University-Computer and Information Sciences","volume":"36 7","pages":"Article 102150"},"PeriodicalIF":5.2000,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1319157824002398/pdfft?md5=0a1496797663e5c523b9ebe20a3e23aa&pid=1-s2.0-S1319157824002398-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Spatial relaxation transformer for image super-resolution\",\"authors\":\"Yinghua Li , Ying Zhang , Hao Zeng , Jinglu He , Jie Guo\",\"doi\":\"10.1016/j.jksuci.2024.102150\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Transformer-based approaches have demonstrated remarkable performance in image processing tasks due to their ability to model long-range dependencies. Current mainstream Transformer-based methods typically confine self-attention computation within windows to reduce computational burden. However, this constraint may lead to grid artifacts in the reconstructed images due to insufficient cross-window information exchange, particularly in image super-resolution tasks. To address this issue, we propose the Multi-Scale Texture Complementation Block based on Spatial Relaxation Transformer (MSRT), which leverages features at multiple scales and augments information exchange through cross windows attention computation. In addition, we introduce a loss function based on the prior of texture smoothness transformation, which utilizes the continuity of textures between patches to constrain the generation of more coherent texture information in the reconstructed images. Specifically, we employ learnable compressive sensing technology to extract shallow features from images, preserving image features while reducing feature dimensions and improving computational efficiency. Extensive experiments conducted on multiple benchmark datasets demonstrate that our method outperforms previous state-of-the-art approaches in both qualitative and quantitative evaluations.</p></div>\",\"PeriodicalId\":48547,\"journal\":{\"name\":\"Journal of King Saud University-Computer and Information Sciences\",\"volume\":\"36 7\",\"pages\":\"Article 102150\"},\"PeriodicalIF\":5.2000,\"publicationDate\":\"2024-08-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S1319157824002398/pdfft?md5=0a1496797663e5c523b9ebe20a3e23aa&pid=1-s2.0-S1319157824002398-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of King Saud University-Computer and Information Sciences\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1319157824002398\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of King Saud University-Computer and Information Sciences","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1319157824002398","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

摘要

基于变换器的方法由于能够模拟长距离依赖关系，因此在图像处理任务中表现出卓越的性能。目前基于变换器的主流方法通常将自注意计算限制在窗口内，以减轻计算负担。然而，由于跨窗口信息交换不足，这种限制可能会导致重建图像中出现网格伪影，尤其是在图像超分辨率任务中。为了解决这个问题，我们提出了基于空间松弛变换器（MSRT）的多尺度纹理补全块，它利用了多个尺度的特征，并通过跨窗注意力计算增强了信息交换。此外，我们还引入了基于纹理平滑度变换先验的损失函数，该函数利用斑块间纹理的连续性来限制在重建图像中生成更连贯的纹理信息。具体来说，我们采用可学习的压缩传感技术从图像中提取浅层特征，在保留图像特征的同时减少特征维数并提高计算效率。在多个基准数据集上进行的广泛实验表明，我们的方法在定性和定量评估方面都优于之前的先进方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Spatial relaxation transformer for image super-resolution

Transformer-based approaches have demonstrated remarkable performance in image processing tasks due to their ability to model long-range dependencies. Current mainstream Transformer-based methods typically confine self-attention computation within windows to reduce computational burden. However, this constraint may lead to grid artifacts in the reconstructed images due to insufficient cross-window information exchange, particularly in image super-resolution tasks. To address this issue, we propose the Multi-Scale Texture Complementation Block based on Spatial Relaxation Transformer (MSRT), which leverages features at multiple scales and augments information exchange through cross windows attention computation. In addition, we introduce a loss function based on the prior of texture smoothness transformation, which utilizes the continuity of textures between patches to constrain the generation of more coherent texture information in the reconstructed images. Specifically, we employ learnable compressive sensing technology to extract shallow features from images, preserving image features while reducing feature dimensions and improving computational efficiency. Extensive experiments conducted on multiple benchmark datasets demonstrate that our method outperforms previous state-of-the-art approaches in both qualitative and quantitative evaluations.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Journal of King Saud University-Computer and Information Sciences COMPUTER SCIENCE, INFORMATION SYSTEMS-

CiteScore

10.50

自引率

8.70%

发文量

656

审稿时长

29 days

期刊介绍： In 2022 the Journal of King Saud University - Computer and Information Sciences will become an author paid open access journal. Authors who submit their manuscript after October 31st 2021 will be asked to pay an Article Processing Charge (APC) after acceptance of their paper to make their work immediately, permanently, and freely accessible to all. The Journal of King Saud University Computer and Information Sciences is a refereed, international journal that covers all aspects of both foundations of computer and its practical applications.