Exploring the rate-distortion-complexity optimization in neural image compression

IF 2.6 4区 计算机科学 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS
Yixin Gao, Runsen Feng, Zongyu Guo, Zhibo Chen
{"title":"Exploring the rate-distortion-complexity optimization in neural image compression","authors":"Yixin Gao,&nbsp;Runsen Feng,&nbsp;Zongyu Guo,&nbsp;Zhibo Chen","doi":"10.1016/j.jvcir.2024.104294","DOIUrl":null,"url":null,"abstract":"<div><div>Despite a short history, neural image codecs have been shown to surpass classical image codecs in terms of rate–distortion performance. However, most of them suffer from significantly longer decoding times, which hinders the practical applications of neural image codecs. This issue is especially pronounced when employing an effective yet time-consuming autoregressive context model since it would increase entropy decoding time by orders of magnitude. In this paper, unlike most previous works that pursue optimal RD performance while temporally overlooking the coding complexity, we make a systematical investigation on the rate–distortion-complexity (RDC) optimization in neural image compression. By quantifying the decoding complexity as a factor in the optimization goal, we are now able to precisely control the RDC trade-off and then demonstrate how the rate–distortion performance of neural image codecs could adapt to various complexity demands. Going beyond the investigation of RDC optimization, a variable-complexity neural codec is designed to leverage the spatial dependencies adaptively according to industrial demands, which supports fine-grained complexity adjustment by balancing the RDC tradeoff. By implementing this scheme in a powerful base model, we demonstrate the feasibility and flexibility of RDC optimization for neural image codecs.</div></div>","PeriodicalId":54755,"journal":{"name":"Journal of Visual Communication and Image Representation","volume":"105 ","pages":"Article 104294"},"PeriodicalIF":2.6000,"publicationDate":"2024-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Visual Communication and Image Representation","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1047320324002505","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Despite a short history, neural image codecs have been shown to surpass classical image codecs in terms of rate–distortion performance. However, most of them suffer from significantly longer decoding times, which hinders the practical applications of neural image codecs. This issue is especially pronounced when employing an effective yet time-consuming autoregressive context model since it would increase entropy decoding time by orders of magnitude. In this paper, unlike most previous works that pursue optimal RD performance while temporally overlooking the coding complexity, we make a systematical investigation on the rate–distortion-complexity (RDC) optimization in neural image compression. By quantifying the decoding complexity as a factor in the optimization goal, we are now able to precisely control the RDC trade-off and then demonstrate how the rate–distortion performance of neural image codecs could adapt to various complexity demands. Going beyond the investigation of RDC optimization, a variable-complexity neural codec is designed to leverage the spatial dependencies adaptively according to industrial demands, which supports fine-grained complexity adjustment by balancing the RDC tradeoff. By implementing this scheme in a powerful base model, we demonstrate the feasibility and flexibility of RDC optimization for neural image codecs.
探索神经图像压缩中的速率-失真-复杂性优化
尽管神经图像编解码技术问世时间不长,但其在速率-失真性能方面已经超越了传统图像编解码技术。然而,大多数神经图像编解码器的解码时间明显较长,这阻碍了神经图像编解码器的实际应用。当采用有效但耗时的自回归上下文模型时,这一问题尤为突出,因为这会使熵解码时间增加几个数量级。在本文中,与以往大多数追求最佳 RD 性能而忽视编码复杂性的研究不同,我们对神经图像压缩中的速率-失真-复杂性(RDC)优化进行了系统研究。通过将解码复杂度量化为优化目标中的一个因素,我们现在能够精确控制 RDC 的权衡,进而证明神经图像编解码器的速率-失真性能如何适应各种复杂度需求。在研究 RDC 优化的基础上,我们设计了一种可变复杂度神经编解码器,可根据行业需求自适应地利用空间依赖性,通过平衡 RDC 权衡来支持复杂度的细粒度调整。通过在一个强大的基础模型中实施这一方案,我们证明了神经图像编解码器 RDC 优化的可行性和灵活性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Journal of Visual Communication and Image Representation
Journal of Visual Communication and Image Representation 工程技术-计算机:软件工程
CiteScore
5.40
自引率
11.50%
发文量
188
审稿时长
9.9 months
期刊介绍: The Journal of Visual Communication and Image Representation publishes papers on state-of-the-art visual communication and image representation, with emphasis on novel technologies and theoretical work in this multidisciplinary area of pure and applied research. The field of visual communication and image representation is considered in its broadest sense and covers both digital and analog aspects as well as processing and communication in biological visual systems.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信