{"title":"率失真优化的JPEG压缩深度预处理","authors":"Fan Ye;Bojun Liu;Li Li;Dong Liu","doi":"10.1109/TCSVT.2025.3550872","DOIUrl":null,"url":null,"abstract":"JPEG is daily used for compressing natural images, while the compressed images often contain visually annoying artifacts especially at low rates. To reduce the compression artifacts, it has been proposed to preprocess an image before the JPEG compression with the help of deep learning, which maintains the standard compliance. However, the existing methods were not fully justified from the rate-distortion optimization perspective. We address this limitation and propose a truly rate-distortion-optimized deep preprocessing method for JPEG compression. We decompose a rate-distortion cost into three parts: rate, distortion, and Lagrangian multiplier. First, we design a rate estimation network and propose to train the network to estimate the JPEG compression rate. Second, we propose to estimate the actual end-to-end distortion (between original and reconstructed images) with a differentiable JPEG simulator, where we specifically design an adaptive discrete cosine transform (DCT) domain masking algorithm. Third, we propose to estimate the actual content-dependent Lagrangian multipliers to combine rate and distortion into a joint loss function that drives the training of the preprocessing network. Our method makes no change to the JPEG encoder and decoder and supports any differentiable distortion measure (e.g. MSE, MS-SSIM, LPIPS). On the Kodak dataset, our method achieves on average 7.59% BD-rate reduction compared to the JPEG baseline when using MSE. With per-image optimization for LPIPS, our method achieves as high as 38.65% BD-rate reduction, and produces high-quality reconstructed images with much less artifacts.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"35 8","pages":"8330-8343"},"PeriodicalIF":11.1000,"publicationDate":"2025-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Rate-Distortion-Optimized Deep Preprocessing for JPEG Compression\",\"authors\":\"Fan Ye;Bojun Liu;Li Li;Dong Liu\",\"doi\":\"10.1109/TCSVT.2025.3550872\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"JPEG is daily used for compressing natural images, while the compressed images often contain visually annoying artifacts especially at low rates. To reduce the compression artifacts, it has been proposed to preprocess an image before the JPEG compression with the help of deep learning, which maintains the standard compliance. However, the existing methods were not fully justified from the rate-distortion optimization perspective. We address this limitation and propose a truly rate-distortion-optimized deep preprocessing method for JPEG compression. We decompose a rate-distortion cost into three parts: rate, distortion, and Lagrangian multiplier. First, we design a rate estimation network and propose to train the network to estimate the JPEG compression rate. Second, we propose to estimate the actual end-to-end distortion (between original and reconstructed images) with a differentiable JPEG simulator, where we specifically design an adaptive discrete cosine transform (DCT) domain masking algorithm. Third, we propose to estimate the actual content-dependent Lagrangian multipliers to combine rate and distortion into a joint loss function that drives the training of the preprocessing network. Our method makes no change to the JPEG encoder and decoder and supports any differentiable distortion measure (e.g. MSE, MS-SSIM, LPIPS). On the Kodak dataset, our method achieves on average 7.59% BD-rate reduction compared to the JPEG baseline when using MSE. With per-image optimization for LPIPS, our method achieves as high as 38.65% BD-rate reduction, and produces high-quality reconstructed images with much less artifacts.\",\"PeriodicalId\":13082,\"journal\":{\"name\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"volume\":\"35 8\",\"pages\":\"8330-8343\"},\"PeriodicalIF\":11.1000,\"publicationDate\":\"2025-03-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10925482/\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Circuits and Systems for Video Technology","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10925482/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Rate-Distortion-Optimized Deep Preprocessing for JPEG Compression
JPEG is daily used for compressing natural images, while the compressed images often contain visually annoying artifacts especially at low rates. To reduce the compression artifacts, it has been proposed to preprocess an image before the JPEG compression with the help of deep learning, which maintains the standard compliance. However, the existing methods were not fully justified from the rate-distortion optimization perspective. We address this limitation and propose a truly rate-distortion-optimized deep preprocessing method for JPEG compression. We decompose a rate-distortion cost into three parts: rate, distortion, and Lagrangian multiplier. First, we design a rate estimation network and propose to train the network to estimate the JPEG compression rate. Second, we propose to estimate the actual end-to-end distortion (between original and reconstructed images) with a differentiable JPEG simulator, where we specifically design an adaptive discrete cosine transform (DCT) domain masking algorithm. Third, we propose to estimate the actual content-dependent Lagrangian multipliers to combine rate and distortion into a joint loss function that drives the training of the preprocessing network. Our method makes no change to the JPEG encoder and decoder and supports any differentiable distortion measure (e.g. MSE, MS-SSIM, LPIPS). On the Kodak dataset, our method achieves on average 7.59% BD-rate reduction compared to the JPEG baseline when using MSE. With per-image optimization for LPIPS, our method achieves as high as 38.65% BD-rate reduction, and produces high-quality reconstructed images with much less artifacts.
期刊介绍:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.