Learning Content-Weighted Pseudocylindrical Representation for 360° Image Compression

IEEE transactions on image processing : a publication of the IEEE Signal Processing Society Pub Date : 2024-10-17 DOI:10.1109/TIP.2024.3477356

Mu Li;Youneng Bao;Xiaohang Sui;Jinxing Li;Guangming Lu;Yong Xu

{"title":"Learning Content-Weighted Pseudocylindrical Representation for 360° Image Compression","authors":"Mu Li;Youneng Bao;Xiaohang Sui;Jinxing Li;Guangming Lu;Yong Xu","doi":"10.1109/TIP.2024.3477356","DOIUrl":null,"url":null,"abstract":"Learned 360° image compression methods using equirectangular projection (ERP) often confront a non-uniform sampling issue, inherent to sphere-to-rectangle projection. While uniformly or nearly uniformly sampling representations, along with their corresponding convolution operations, have been proposed to mitigate this issue, these methods often concentrate solely on uniform sampling rates, thus neglecting the content of the image. In this paper, we urge that different contents within 360° images have varying significance and advocate for the adoption of a content-adaptive parametric representation in 360° image compression, which takes into account both the content and sampling rate. We first introduce the parametric pseudocylindrical representation and corresponding convolution operation, upon which we build a learned 360° image codec. Then, we model the hyperparameter of the representation as the output of a network, derived from the image’s content and its spherical coordinates. We treat the optimization of hyperparameters for different 360° images as distinct compression tasks and propose a meta-learning algorithm to jointly optimize the codec and the metaknowledge, i.e., the hyperparameter estimation network. A significant challenge is the lack of a direct derivative from the compression loss to the hyperparameter network. To address this, we present a novel method to relax the rate-distortion loss as a function of the hyperparameters, enabling gradient-based optimization of the metaknowledge. Experimental results on omnidirectional images demonstrate that our method achieves state-of-the-art performance and superior visual quality.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"33 ","pages":"5975-5988"},"PeriodicalIF":0.0000,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10721338/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

Abstract

Learned 360° image compression methods using equirectangular projection (ERP) often confront a non-uniform sampling issue, inherent to sphere-to-rectangle projection. While uniformly or nearly uniformly sampling representations, along with their corresponding convolution operations, have been proposed to mitigate this issue, these methods often concentrate solely on uniform sampling rates, thus neglecting the content of the image. In this paper, we urge that different contents within 360° images have varying significance and advocate for the adoption of a content-adaptive parametric representation in 360° image compression, which takes into account both the content and sampling rate. We first introduce the parametric pseudocylindrical representation and corresponding convolution operation, upon which we build a learned 360° image codec. Then, we model the hyperparameter of the representation as the output of a network, derived from the image’s content and its spherical coordinates. We treat the optimization of hyperparameters for different 360° images as distinct compression tasks and propose a meta-learning algorithm to jointly optimize the codec and the metaknowledge, i.e., the hyperparameter estimation network. A significant challenge is the lack of a direct derivative from the compression loss to the hyperparameter network. To address this, we present a novel method to relax the rate-distortion loss as a function of the hyperparameters, enabling gradient-based optimization of the metaknowledge. Experimental results on omnidirectional images demonstrate that our method achieves state-of-the-art performance and superior visual quality.

查看原文本刊更多论文

学习内容加权伪圆柱表示法以实现 360° 图像压缩

使用等角投影（ERP）学习的 360° 图像压缩方法经常会遇到球到角投影固有的非均匀采样问题。虽然已经提出了均匀或接近均匀采样表示法以及相应的卷积运算来缓解这一问题，但这些方法往往只关注均匀采样率，从而忽略了图像的内容。在本文中，我们认为 360° 图像中的不同内容具有不同的意义，并主张在 360° 图像压缩中采用内容自适应参数表示法，这种表示法同时考虑了内容和采样率。我们首先介绍了参数伪圆柱表示法和相应的卷积运算，并在此基础上建立了学习型 360° 图像编解码器。然后，我们根据图像内容及其球面坐标，将表示的超参数建模为网络输出。我们将优化不同 360° 图像的超参数视为不同的压缩任务，并提出了一种元学习算法，用于联合优化编解码器和元知识（即超参数估计网络）。一个重大挑战是缺乏从压缩损失到超参数网络的直接导数。为了解决这个问题，我们提出了一种新方法，将速率失真损失作为超参数的函数进行放松，从而实现基于梯度的元知识优化。全向图像的实验结果表明，我们的方法实现了最先进的性能和卓越的视觉质量。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

IEEE transactions on image processing : a publication of the IEEE Signal Processing Society

自引率

0.00%

发文量