Universal End-to-End Neural Network for Lossy Image Compression
Bouzid Arezki, Fangchen Feng, Anissa Mokraoui
arXiv:2409.06586 (arXiv - EE - Image and Video Processing, published 2024-09-10)
Citations: 0
Abstract
This paper presents variable-bitrate lossy image compression using a VAE-based neural network and proposes an adaptable image-quality adjustment strategy. The key innovation is adjusting the input scale exclusively at inference time, which yields an efficient rate-distortion control mechanism. Extensive experiments across diverse VAE-based compression architectures (CNN, ViT) and training objectives (MSE, SSIM) show that the approach is remarkably universal, a property we attribute to the inherent generalization capacity of neural networks. Unlike methods that modify the model architecture or loss function, our approach emphasizes simplicity, reducing computational complexity and memory requirements. The experiments not only demonstrate the effectiveness of our approach but also indicate its potential to advance variable-rate neural-network lossy image compression.
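The core idea (scaling the input at inference time to trade rate for distortion, with the pretrained codec left untouched) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the learned VAE codec is stood in for by a simple uniform quantizer, and the function names and the scale values are hypothetical.

```python
import numpy as np

def compress_at_scale(image, codec_encode, codec_decode, s):
    """Hypothetical inference-time rate control: scale the input by s
    before a fixed pretrained codec, then invert the scaling after
    decoding. Larger s pushes the signal into finer quantization bins
    (higher rate, lower distortion); smaller s does the opposite."""
    x = image * s                          # adjust input scale
    y = codec_decode(codec_encode(x))      # fixed codec, unchanged weights
    return np.clip(y / s, 0.0, 1.0)        # undo the scaling

# Stub codec: a uniform quantizer stands in for the learned VAE here.
encode = lambda x: np.round(x * 32)
decode = lambda q: q / 32

img = np.random.rand(64, 64)
low_quality = compress_at_scale(img, encode, decode, s=0.5)   # coarser
high_quality = compress_at_scale(img, encode, decode, s=2.0)  # finer
```

Even with this toy quantizer, the reconstruction error drops as s grows, mirroring the rate-distortion behavior the paper reports for real VAE codecs.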