Transformer based Douglas-Rachford unrolling network for compressed sensing

Impact factor: 2.7 · Zone 3, Engineering Technology · JCR Q2 (Engineering, Electrical & Electronic)

Signal Processing-Image Communication · Pub Date: 2024-05-24 · DOI: 10.1016/j.image.2024.117153

Yueming Su, Qiusheng Lian, Dan Zhang, Baoshun Shi

Volume 127, Article 117153. Full text: https://www.sciencedirect.com/science/article/pii/S0923596524000547


Abstract

Compressed sensing (CS) with a binary sampling matrix is hardware-friendly and memory-efficient. Existing Convolutional Neural Network (CNN)-based CS methods can be limited in exploiting non-local similarity and lack interpretability. Meanwhile, the Transformer architecture performs well at modelling long-range correlations. To improve CS reconstruction quality from highly under-sampled measurements, a Transformer-based deep unrolling reconstruction network, abbreviated as DR-TransNet, is proposed; its design is inspired by the traditional iterative Douglas-Rachford algorithm. It combines the structural insight of optimization-based methods with the speed of network-based ones. Within it, a U-shaped Transformer-based proximal sub-network reconstructs images in the wavelet domain, with the spatial domain as an auxiliary mode, in order to capture both local details and global long-range interactions in the images. Notably, a single flexible model is trained to handle CS reconstruction at different binary sampling ratios. Compared with state-of-the-art CS reconstruction methods that use a binary sampling matrix, the proposed method achieves improvements in Peak Signal to Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM) and visual quality. Code is available at https://github.com/svyueming/DR-TransNet.
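For readers unfamiliar with the Douglas-Rachford algorithm that the abstract names, the following is a minimal NumPy sketch of the classical iteration applied to a toy CS problem with a binary (0/1) sampling matrix. It is not the authors' DR-TransNet: the learned Transformer proximal sub-network is replaced here by plain soft-thresholding, the signal is assumed sparse in the identity basis rather than in a wavelet domain, and the function names, problem sizes, `tau` and `n_iter` are all illustrative assumptions.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (hand-crafted stand-in for a learned prior)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def douglas_rachford_cs(A, y, tau=0.1, n_iter=200):
    """Classical Douglas-Rachford splitting for: min ||x||_1 subject to A x = y."""
    A_pinv = np.linalg.pinv(A)

    def project(z):
        # Proximal step for the data term: projection onto {x : A x = y}.
        return z - A_pinv @ (A @ z - y)

    z = A_pinv @ y  # start from the minimum-norm solution
    for _ in range(n_iter):
        x = project(z)                        # data-consistency step
        v = soft_threshold(2.0 * x - z, tau)  # prior (proximal) step on the reflection
        z = z + v - x                         # update of the auxiliary variable
    return project(z)


# Toy problem (all sizes are assumptions): 64-dim signal, 32 binary measurements.
rng = np.random.default_rng(0)
n, m, k = 64, 32, 4
A = rng.integers(0, 2, size=(m, n)).astype(float)  # binary 0/1 sampling matrix
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)
y = A @ x_true

x_hat = douglas_rachford_cs(A, y)
print("measurement residual:", np.linalg.norm(A @ x_hat - y))
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```

In a deep unrolling network of the kind the abstract describes, a fixed number of such iterations would become network stages, with the hand-crafted proximal step swapped for a trainable sub-network; the exact stage design is given in the paper and repository, not in this sketch.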


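Source journal

Signal Processing-Image Communication (Engineering Technology: Engineering, Electrical & Electronic)

CiteScore: 8.40

Self-citation rate: 2.90%

Articles published: 138

Review time: 5.2 months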

About the journal: Signal Processing: Image Communication is an international journal for the development of the theory and practice of image communication. Its primary objectives are the following: To present a forum for the advancement of theory and practice of image communication. To stimulate cross-fertilization between areas similar in nature which have traditionally been separated, for example, various aspects of visual communications and information systems. To contribute to a rapid information exchange between the industrial and academic environments. The editorial policy and the technical content of the journal are the responsibility of the Editor-in-Chief, the Area Editors and the Advisory Editors. The Journal is self-supporting from subscription income and contains a minimum amount of advertisements. Advertisements are subject to the prior approval of the Editor-in-Chief. The journal welcomes contributions from every country in the world. Signal Processing: Image Communication publishes articles relating to aspects of the design, implementation and use of image communication systems. The journal features original research work, tutorial and review articles, and accounts of practical developments. Subjects of interest include image/video coding, 3D video representations and compression, 3D graphics and animation compression, HDTV and 3DTV systems, video adaptation, video over IP, peer-to-peer video networking, interactive visual communication, multi-user video conferencing, wireless video broadcasting and communication, visual surveillance, 2D and 3D image/video quality measures, pre/post processing, video restoration and super-resolution, multi-camera video analysis, motion analysis, content-based image/video indexing and retrieval, face and gesture processing, video synthesis, 2D and 3D image/video acquisition and display technologies, architectures for image/video processing and communication.