Mu Li;Youneng Bao;Xiaohang Sui;Jinxing Li;Guangming Lu;Yong Xu
{"title":"Learning Content-Weighted Pseudocylindrical Representation for 360° Image Compression","authors":"Mu Li;Youneng Bao;Xiaohang Sui;Jinxing Li;Guangming Lu;Yong Xu","doi":"10.1109/TIP.2024.3477356","DOIUrl":null,"url":null,"abstract":"Learned 360° image compression methods using equirectangular projection (ERP) often confront a non-uniform sampling issue, inherent to sphere-to-rectangle projection. While uniformly or nearly uniformly sampling representations, along with their corresponding convolution operations, have been proposed to mitigate this issue, these methods often concentrate solely on uniform sampling rates, thus neglecting the content of the image. In this paper, we urge that different contents within 360° images have varying significance and advocate for the adoption of a content-adaptive parametric representation in 360° image compression, which takes into account both the content and sampling rate. We first introduce the parametric pseudocylindrical representation and corresponding convolution operation, upon which we build a learned 360° image codec. Then, we model the hyperparameter of the representation as the output of a network, derived from the image’s content and its spherical coordinates. We treat the optimization of hyperparameters for different 360° images as distinct compression tasks and propose a meta-learning algorithm to jointly optimize the codec and the metaknowledge, i.e., the hyperparameter estimation network. A significant challenge is the lack of a direct derivative from the compression loss to the hyperparameter network. To address this, we present a novel method to relax the rate-distortion loss as a function of the hyperparameters, enabling gradient-based optimization of the metaknowledge. Experimental results on omnidirectional images demonstrate that our method achieves state-of-the-art performance and superior visual quality.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"33 ","pages":"5975-5988"},"PeriodicalIF":0.0000,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10721338/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Learned 360° image compression methods using equirectangular projection (ERP) often confront a non-uniform sampling issue, inherent to sphere-to-rectangle projection. While uniformly or nearly uniformly sampling representations, along with their corresponding convolution operations, have been proposed to mitigate this issue, these methods often concentrate solely on uniform sampling rates, thus neglecting the content of the image. In this paper, we urge that different contents within 360° images have varying significance and advocate for the adoption of a content-adaptive parametric representation in 360° image compression, which takes into account both the content and sampling rate. We first introduce the parametric pseudocylindrical representation and corresponding convolution operation, upon which we build a learned 360° image codec. Then, we model the hyperparameter of the representation as the output of a network, derived from the image’s content and its spherical coordinates. We treat the optimization of hyperparameters for different 360° images as distinct compression tasks and propose a meta-learning algorithm to jointly optimize the codec and the metaknowledge, i.e., the hyperparameter estimation network. A significant challenge is the lack of a direct derivative from the compression loss to the hyperparameter network. To address this, we present a novel method to relax the rate-distortion loss as a function of the hyperparameters, enabling gradient-based optimization of the metaknowledge. Experimental results on omnidirectional images demonstrate that our method achieves state-of-the-art performance and superior visual quality.