Autoencoder Model Exploration for Multi-Layer Video Compression
Luiz Henrique Cancellier, M. Grellert, José Luís Almada Güntzel, L. Cruz
2022 10th European Workshop on Visual Information Processing (EUVIP), published 2022-09-11
DOI: 10.1109/EUVIP53989.2022.9922780 (https://doi.org/10.1109/EUVIP53989.2022.9922780)
Abstract
The use of autoencoder models for image and video compression has been explored in a number of recent works. While those works compress the original data in a single layer, we propose using autoencoder models in two-layer video coding. Adopting a multi-layer encoder provides scalability and allows us to decouple the traditional video coding implementation from the neural network (NN) solution. Because the NN is restricted to the enhancement layer, the base layer bitstream can be decoded without running the NN-based decoding process. We implemented and evaluated two autoencoder models: one with a symmetric encoder/decoder architecture, and an asymmetric alternative that employs more layers on the decoder side. The models were trained to compress residues in an All Intra encoding scenario with spatial scalability. The Asymmetric model outperformed the Symmetric one in both compression rate and quality, as confirmed by average BD-Rate and BD-PSNR results of -17.06% and 0.7 dB, respectively.
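The two-layer arrangement described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the residue is formed, so the sketch assumes it is the difference between the original frame and the upsampled base-layer reconstruction. All function names, the 2x scaling factor, and the step sizes are hypothetical, and `quantize` is a stand-in for both the conventional base-layer codec and the enhancement-layer autoencoder.

```python
# Illustrative two-layer (spatially scalable) coding of one grayscale frame.
# Every name and parameter here is a hypothetical stand-in: quantize() mimics
# a lossy encode/decode round trip, not a real video codec or autoencoder.

def downsample2x(frame):
    """Average each 2x2 block (frame dimensions are assumed to be even)."""
    h, w = len(frame), len(frame[0])
    return [[(frame[y][x] + frame[y][x + 1]
              + frame[y + 1][x] + frame[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

def upsample2x(frame):
    """Nearest-neighbour upsampling by a factor of two."""
    out = []
    for row in frame:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def quantize(frame, step):
    """Stand-in for a lossy codec: snap each sample to a multiple of step."""
    return [[round(v / step) * step for v in row] for row in frame]

def encode_two_layer(frame, base_step=16, enh_step=4):
    # Base layer: low-resolution frame through the "traditional" codec.
    base_rec = quantize(downsample2x(frame), base_step)
    # Enhancement layer: residue against the upsampled base reconstruction
    # (assumed residue definition; the autoencoder would sit here).
    prediction = upsample2x(base_rec)
    residue = [[o - p for o, p in zip(row_o, row_p)]
               for row_o, row_p in zip(frame, prediction)]
    enh_rec = quantize(residue, enh_step)
    return base_rec, enh_rec

def decode_base_only(base_rec):
    """Base-layer decoding needs no enhancement-layer (NN) processing."""
    return upsample2x(base_rec)

def decode_both(base_rec, enh_rec):
    """Full-quality decoding adds the reconstructed residue back."""
    prediction = upsample2x(base_rec)
    return [[p + r for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(prediction, enh_rec)]

def mse(a, b):
    """Mean squared error between two frames of equal size."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b)) / n

# Synthetic 8x8 test frame (made-up data, for demonstration only).
frame = [[(7 * x + 13 * y) % 256 for x in range(8)] for y in range(8)]
base_rec, enh_rec = encode_two_layer(frame)
base_error = mse(frame, decode_base_only(base_rec))
full_error = mse(frame, decode_both(base_rec, enh_rec))
print("base-only MSE:", base_error)
print("base + enhancement MSE:", full_error)
```

The point of the sketch is the decoupling: `decode_base_only` never touches the enhancement data, so a decoder without NN support can still produce a lower-quality picture, while `decode_both` refines it with the residue.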