Scalar and Vector Quantization for Learned Image Compression: A Study on the Effects of MSE and GAN Loss in Various Spaces

Jonas Löhdefink, Fabian Hüger, Peter Schlicht, T. Fingscheidt

2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), 20 September 2020. DOI: 10.1109/ITSC45102.2020.9294350
Recently, learned image compression by means of neural networks has experienced a performance boost through the use of adversarial loss functions. Typically, a generative adversarial network (GAN) is designed with the generator being an autoencoder that contains a quantizer in its bottleneck for compression and reconstruction. It is well known from rate-distortion theory that vector quantizers achieve lower quantization errors than scalar quantizers at the same bitrate. Still, learned image compression approaches often use scalar quantization instead. In this work we provide insights into the image reconstruction quality of the often-employed uniform scalar quantizers, of non-uniform scalar quantizers, and of the rarely employed but bitrate-efficient vector quantizers, all integrated into backpropagation and operating at exactly the same bitrate. Further interesting insights are obtained from our investigation of an MSE loss and a GAN loss. We show that vector quantization is always beneficial for compression performance, both in the latent space and in the reconstructed image space. However, image samples demonstrate that the GAN loss produces the more pleasing reconstructed images, while the non-adversarial MSE loss yields better scores on various instrumental quality measures, both in the latent space and on the reconstructed images.
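The abstract does not specify implementation details. As a minimal sketch of how a hard quantizer can sit in an autoencoder bottleneck and still be "integrated into backpropagation," the PyTorch snippet below uses a straight-through gradient estimator and chooses the number of scalar levels and the vector-quantizer codebook size so that both quantizers operate at the same bitrate. All function names, level counts, and codebook sizes are illustrative assumptions, not the authors' implementation; in a trained system the codebook and non-uniform levels would themselves typically be learned, whereas here the codebook is random for brevity.

```python
# Minimal sketch (PyTorch, illustrative only): scalar and vector quantization
# in an autoencoder bottleneck, made trainable with a straight-through
# gradient estimator. Level counts and codebook size are assumptions chosen
# so that both quantizers use the same number of bits.
import torch

def straight_through(z, z_q):
    # Forward pass outputs the quantized value; backward pass copies the
    # gradient through unchanged (straight-through estimator).
    return z + (z_q - z).detach()

def uniform_scalar_quantize(z, levels=4):
    # Quantize each latent element independently to `levels` uniform levels
    # in [-1, 1]: log2(levels) bits per element.
    step = 2.0 / (levels - 1)
    z_q = torch.round((z.clamp(-1.0, 1.0) + 1.0) / step) * step - 1.0
    return straight_through(z, z_q)

def vector_quantize(z, codebook):
    # Quantize groups of d latent elements jointly to the nearest codeword:
    # log2(K) bits per d-dimensional vector.
    # z: (batch, n, d), codebook: (K, d)
    dists = torch.cdist(z, codebook.unsqueeze(0).expand(z.size(0), -1, -1))
    idx = dists.argmin(dim=-1)      # (batch, n) codeword indices
    z_q = codebook[idx]             # (batch, n, d) quantized vectors
    return straight_through(z, z_q)

# Equal bitrate: 4 scalar levels over d = 4 elements -> 4 * log2(4) = 8 bits,
# versus one codeword out of K = 4**4 = 256 -> log2(256) = 8 bits per vector.
z = torch.randn(2, 16, 4, requires_grad=True)   # stand-in for encoder output
codebook = torch.randn(256, 4)
z_sq = uniform_scalar_quantize(z)
z_vq = vector_quantize(z, codebook)
z_vq.sum().backward()   # gradients reach z despite the hard quantization
```

The straight-through trick is one common way to make a non-differentiable quantizer compatible with gradient-based training; the paper integrates uniform scalar, non-uniform scalar, and vector quantizers into backpropagation, though not necessarily with this exact estimator.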