An Efficient End-To-End Image Compression Transformer
Afsana Ahsan Jeny, Masum Shah Junayed, Md Baharul Islam
2022 IEEE International Conference on Image Processing (ICIP), 16 October 2022. DOI: 10.1109/ICIP46576.2022.9897663
Abstract
Image and video compression have received significant research attention, and their applications have expanded. Existing entropy estimation-based methods combine a hyperprior with local context, which limits their efficacy. This paper introduces an efficient end-to-end transformer-based image compression model that builds a global receptive field to tackle long-range correlation. A hyper encoder-decoder transformer block employs a multi-head spatial reduction self-attention (MHSRSA) layer to reduce the computational cost of self-attention and to enable rapid learning of multi-scale, high-resolution features. A Causal Global Anticipation Module (CGAM) constructs highly informative adjacent contexts using channel-wise linkages and identifies global reference points in the latent space for end-to-end rate-distortion optimization (RDO). Experimental results on the Kodak dataset demonstrate the model's effectiveness and competitive performance.
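The abstract gives no implementation details, so the following is a minimal sketch of spatial-reduction multi-head self-attention in the spirit of the MHSRSA layer, assuming a PVT-style design in which keys and values are spatially downsampled by a strided convolution before attention. The class name, the `sr_ratio` argument, and the exact layer layout are illustrative assumptions, not the authors' published implementation.

```python
import torch
import torch.nn as nn


class MHSRSA(nn.Module):
    """Spatial-reduction multi-head self-attention (sketch).

    Queries attend over all N = h*w positions, but keys/values are
    computed on a grid downsampled by `sr_ratio`, so the attention map
    shrinks from N x N to N x (N / sr_ratio**2).
    """

    def __init__(self, dim: int, num_heads: int = 8, sr_ratio: int = 2):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5

        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, dim * 2)
        self.proj = nn.Linear(dim, dim)

        # Strided conv shrinks the spatial grid that keys/values cover.
        self.sr_ratio = sr_ratio
        if sr_ratio > 1:
            self.sr = nn.Conv2d(dim, dim, kernel_size=sr_ratio, stride=sr_ratio)
            self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, h: int, w: int) -> torch.Tensor:
        # x: (B, N, C) with N = h * w flattened spatial positions.
        b, n, c = x.shape
        q = self.q(x).reshape(b, n, self.num_heads, self.head_dim).transpose(1, 2)

        if self.sr_ratio > 1:
            # Downsample K/V spatially: N tokens -> N / sr_ratio**2 tokens.
            x_ = x.transpose(1, 2).reshape(b, c, h, w)
            x_ = self.sr(x_).reshape(b, c, -1).transpose(1, 2)
            x_ = self.norm(x_)
        else:
            x_ = x
        kv = self.kv(x_).reshape(b, -1, 2, self.num_heads, self.head_dim)
        k, v = kv.permute(2, 0, 3, 1, 4)  # each: (B, heads, N', head_dim)

        attn = (q @ k.transpose(-2, -1)) * self.scale  # (B, heads, N, N')
        out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(b, n, c)
        return self.proj(out)


if __name__ == "__main__":
    attn = MHSRSA(dim=64, num_heads=8, sr_ratio=2)
    tokens = torch.randn(1, 16 * 16, 64)  # one 16x16 feature map, 64 channels
    print(attn(tokens, h=16, w=16).shape)  # torch.Size([1, 256, 64])
```

With reduction ratio r, each query attends to N/r² keys instead of N, which is the source of the efficiency gain over vanilla self-attention on high-resolution feature maps. The end-to-end RDO objective in such learned-compression models is conventionally L = R + λD, the estimated bitrate of the latents plus λ-weighted reconstruction distortion; the CGAM's channel-wise causal context would feed the entropy model that estimates R, a detail not reproduced in this sketch.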