Adaptive Latitude-Aware and Importance-Activated Transform Coding for Learned Omnidirectional Image Compression
Authors: Hui Hu; Yunhui Shi; Jin Wang; Nam Ling; Baocai Yin
Journal: IEEE Transactions on Broadcasting, vol. 71, no. 3, pp. 874-888 (Impact Factor 4.8; JCR Q2, Engineering, Electrical & Electronic)
Publication date: 2025-03-15
DOI: 10.1109/TBC.2025.3565895
URL: https://ieeexplore.ieee.org/document/11005398/
Based on the measured latitude and longitude, users can freely view an omnidirectional image from different perspectives. Omnidirectional images are typically represented in the equirectangular projection (ERP) format. Although ERP images suffer from distortion and redundancy caused by oversampling, which makes traditional codecs inefficient, they maintain visual consistency and remain compatible with deep learning-based image processing tools. This has led to the emergence of end-to-end omnidirectional image compression methods based on the ERP format. However, transform coding, a key component of learned planar image compression, has not yet been fully explored for learned omnidirectional image compression. In this paper, we propose a transform coding method with adaptive latitude-aware and importance-activated features for omnidirectional image compression. The adaptive latitude-aware mechanism comprises two modules. The first, the Adaptive Latitude-aware Module (ALAM), employs rectangular dilated convolutional kernels of multiple sizes to perceive distortion redundancy across different latitudes, followed by latitude-adaptive weighting that selects the optimal features for each latitude. The second, the Multi-scale Convolutional Gated Feedforward Network (MCGFN), exploits local contextual information while suppressing the feature redundancy introduced by the diverse dilated convolutions in the first module. To further reduce ERP redundancy, we design an importance-activated spatial feature transform module that regulates the latent representations so that more bits are allocated to significant regions. Experimental results demonstrate that the proposed method outperforms the existing VVC standard and learning-based omnidirectional image compression approaches at medium-to-high bitrates while maintaining low computational complexity.
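The latitude-dependent oversampling that motivates the method can be illustrated with a short sketch. It is not the authors' ALAM or any part of their codec; it only computes the standard geometric fact that an ERP row at latitude φ covers a circle whose circumference is proportional to cos φ, so that row is stretched horizontally by 1 / cos φ relative to the equator. The function names and the row-center sampling convention are assumptions made for illustration.

```python
import math

def erp_row_latitudes(height):
    """Latitude in radians at the center of each ERP row, from top (+pi/2) to bottom (-pi/2).

    Assumes rows are sampled at their centers (hypothetical convention for this sketch).
    """
    return [math.pi / 2 - (r + 0.5) * math.pi / height for r in range(height)]

def erp_row_weights(height):
    """cos(latitude) per row: relative share of sphere area represented by that row."""
    return [math.cos(lat) for lat in erp_row_latitudes(height)]

def horizontal_oversampling(height):
    """1 / cos(latitude) per row: horizontal stretch relative to the equator."""
    return [1.0 / w for w in erp_row_weights(height)]

if __name__ == "__main__":
    h = 8  # tiny image height, just to show the trend
    rows = zip(erp_row_latitudes(h), erp_row_weights(h), horizontal_oversampling(h))
    for r, (lat, w, s) in enumerate(rows):
        print(f"row {r}: latitude {math.degrees(lat):+6.1f} deg, weight {w:.3f}, stretch {s:.2f}x")
```

Rows near the poles receive small weights and large stretch factors, which is why a single fixed kernel shape treats polar and equatorial content unequally and why latitude-dependent processing is a natural design choice.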
About the journal:
The Society’s Field of Interest is “Devices, equipment, techniques and systems related to broadcast technology, including the production, distribution, transmission, and propagation aspects.” In addition to this formal FOI statement, which is used to provide guidance to the Publications Committee in the selection of content, the AdCom has further resolved that “broadcast systems includes all aspects of transmission, propagation, and reception.”