Yingwen Zhang;Meng Wang;Junru Li;Kai Zhang;Li Zhang;Shiqi Wang
Title: A Theoretical and Experimental Study for Dependent Learned Rate-Distortion Optimization
Journal: IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 9, pp. 9414-9427
DOI: 10.1109/TCSVT.2025.3555152
Published: 2025-03-26
URL: https://ieeexplore.ieee.org/document/10942437/
Publication type: Journal Article
Impact factor: 11.1 (JCR Q1, Engineering, Electrical & Electronic)
Open access: No
Citations: 0
Abstract
Recent advances in learned rate-distortion optimization (RDO) show that making intra coding decisions based on a learned measure can significantly accelerate encoding without incurring much coding loss. Despite great progress in complexity reduction, the dependency issue has been largely neglected in current learned RDO research. In this study, aiming to tap the full potential of dependent learned RDO, we first derive a probabilistic RDO framework for theoretical analysis, under which the classic and the learned RDO problems are equivalent to maximum a posteriori (MAP) inference and distribution imitation, respectively. Subsequently, we probabilistically revisit dependency considerations in intra RDO research. Our key finding is that the existing learned RDO scheme can only produce a measure that indicates the local “goodness” of coding decisions. We therefore further discuss the opportunities for learning a dependent measure that is closer to optimal in the long run. Finally, as learning an accurate measure for the full decision space could be extremely challenging, we take High Efficiency Video Coding (HEVC) intra coding as a case study and experimentally identify that the prediction decision accounts for the majority of the dependent optimization gain and is the most valuable decision to learn, paving the way for future research on dependent learned RDO.
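The abstract does not give the formulation behind the MAP equivalence, so the following is only a generic sketch of how such a correspondence is commonly written, not the authors' derivation. The symbols $m$ (a coding decision), $D$, $R$, $\lambda$, $J$, and the Gibbs-form distribution $p$ are all assumptions introduced here for illustration.

```latex
% Classic Lagrangian RDO for a single decision m (assumed notation):
\[
  m^{*} = \arg\min_{m} J(m), \qquad J(m) = D(m) + \lambda R(m).
\]
% Define a distribution over decisions from the cost (an illustrative choice):
\[
  p(m) = \frac{\exp\bigl(-J(m)\bigr)}{\sum_{m'} \exp\bigl(-J(m')\bigr)}
       \;\propto\; \exp\bigl(-D(m)\bigr)\,\exp\bigl(-\lambda R(m)\bigr).
\]
% Because exp(-J) is strictly decreasing in J, the two problems share a solution:
\[
  \arg\max_{m} p(m) = \arg\min_{m} J(m).
\]
% Dependency: for a sequence of decisions m_1, ..., m_N, each block's cost
% depends on the earlier decisions, so the joint problem is
\[
  (m_1^{*}, \dots, m_N^{*}) = \arg\min_{m_1, \dots, m_N}
      \sum_{i=1}^{N} J_i\bigl(m_i \mid m_1, \dots, m_{i-1}\bigr),
\]
% whereas a greedy (local) encoder fixes the earlier choices and solves
\[
  \hat{m}_i = \arg\min_{m_i} J_i\bigl(m_i \mid \hat{m}_1, \dots, \hat{m}_{i-1}\bigr),
\]
% which in general does not minimize the joint sum.
```

Read this way, a measure learned to imitate the greedy choices $\hat{m}_i$ can only score decisions locally, which is consistent with the key finding stated in the abstract.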
About the journal:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.