Learned Image Compression With Adaptive Channel and Window-Based Spatial Entropy Models

IF 4.3; Zone 2, Computer Science; JCR Q1, ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Consumer Electronics Pub Date : 2024-10-23 DOI:10.1109/TCE.2024.3485179

Jian Wang;Qiang Ling

IEEE Transactions on Consumer Electronics, vol. 70, no. 4, pp. 6430-6441. Full text: https://ieeexplore.ieee.org/document/10730794/

Citations: 0

Abstract


Learned Image Compression With Adaptive Channel and Window-Based Spatial Entropy Models

Image compression is essential for reducing the cost of storing or transmitting images. Recently, learned image compression methods have achieved better compression performance than traditional image compression standards. Many learned image compression methods use convolutional entropy models to remove local spatial and channel redundancy in the latent representation. Some recent methods incorporate transformers to further eliminate non-local redundancy. However, these methods employ the same transformer structure to model both spatial and channel correlations, and thereby fail to exploit the difference between the spatial and channel characteristics of the latent representation. To resolve this issue, we propose novel adaptive channel and window-based spatial entropy models. The adaptive channel entropy model, which consists of a channel transformer module and a channel excitation module, dynamically fuses and excites channel information to implicitly predict the channel context. More specifically, we first establish the relationship between the decoded channels and the channels to be encoded. Based on that relationship, the channel transformer module adaptively updates the predicted channel context. Finally, the channel excitation module emphasizes informative channel context and suppresses irrelevant channel context. Furthermore, we introduce a window-based spatial entropy model that captures global semantic information within each window and generates the spatial context of non-anchor features from the decoded anchor features. The spatial context and channel context are combined to predict the Gaussian parameters of the latent representation. Experimental results demonstrate that our method outperforms some state-of-the-art image compression methods on the Kodak, CLIC, and Tecnick datasets.
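The abstract says the combined context predicts Gaussian parameters for the latent representation but does not spell out how those parameters become a bit cost. The sketch below illustrates the discretized-Gaussian rate estimate commonly used in learned image compression, under the assumption that latents are rounded to integers and that each value has a predicted mean and scale. The function names (`gaussian_cdf`, `symbol_bits`, `total_bits`) and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def gaussian_cdf(x, mean, scale):
    """CDF of a normal distribution N(mean, scale^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def symbol_bits(y_hat, mean, scale, min_prob=1e-9):
    """Estimated bits needed to code one integer-quantized latent value.

    The probability is the Gaussian mass of the unit-width bin centered on
    y_hat; min_prob (an assumed floor) keeps the logarithm finite.
    """
    upper = gaussian_cdf(y_hat + 0.5, mean, scale)
    lower = gaussian_cdf(y_hat - 0.5, mean, scale)
    return -math.log2(max(upper - lower, min_prob))


def total_bits(latents, means, scales):
    """Sum the per-symbol bit estimates over a flat list of latents."""
    return sum(symbol_bits(y, m, s) for y, m, s in zip(latents, means, scales))


if __name__ == "__main__":
    latents = [0, 1, -2, 3]  # hypothetical quantized latent values
    # Parameters that track the data closely, as a good context model would give.
    predicted = total_bits(latents, [0.1, 0.9, -2.0, 2.8], [0.5] * 4)
    # A context-free baseline: one fixed mean and a wide scale for every value.
    fixed = total_bits(latents, [0.0] * 4, [2.0] * 4)
    print(f"context-predicted parameters: {predicted:.2f} bits")
    print(f"fixed parameters: {fixed:.2f} bits")
```

The closer the predicted mean sits to the actual value and the tighter the scale, the more probability mass falls in the coded bin and the fewer bits are spent, which is why better spatial and channel context translates into a lower bit rate.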


Source journal

IEEE Transactions on Consumer Electronics (Engineering & Technology: Telecommunications)

CiteScore: 7.70

Self-citation rate: 9.30%

Articles published:

Review time: 3.3 months

Journal description: The main focus of the IEEE Transactions on Consumer Electronics is the engineering and research aspects of the theory, design, construction, manufacture, or end use of mass-market electronics, systems, software, and services for consumers.