{"title":"Content-Aware Dynamic In-Loop Filter With Adjustable Complexity for VVC Intra Coding","authors":"Hengyu Man;Hao Wang;Riyu Lu;Zhaolin Wan;Xiaopeng Fan;Debin Zhao","doi":"10.1109/TCSVT.2025.3535784","DOIUrl":null,"url":null,"abstract":"Recently, neural network-based in-loop filters have been rapidly developed, effectively improving the reconstruction quality and compression efficiency in video coding. Existing deep in-loop filters typically employed networks with fixed structures to process all image blocks. However, under various bitrate conditions, compressed image blocks with different textures exhibit varying degradations, which poses a challenge for high-quality and low-complexity filtering. Additionally, different complexity requirements for coding tools in various scenarios limit the versatility of fixed models. To address these problems, a content-aware dynamic in-loop filter (dubbed DILF) with adjustable complexity is proposed in this paper. Specifically, DILF comprises a policy network and a filtering network. For each reconstructed image block, the policy network dynamically generates a filtering network topology based on pixel information and the quantization parameter (QP), guiding the filtering network to skip redundant layers and conduct content-aware image enhancement, thereby improving the filtering performance. In addition, by introducing a user-defined balancing factor into the policy network, the content-aware filtering network topology can be further adjusted according to user’s requirements, facilitating adjustable complexity with a single model. We integrate DILF into Versatile Video Coding (VVC) to replace the built-in deblocking filter. Extensive experiments demonstrate the efficiency of DILF in processing image blocks with varying degrees of degradation and its flexibility in controlling complexity. When the balancing factor is set to 2e-5, DILF achieves bitrate savings of 8.07%, 17.97%, and 20.93% on average for YUV components over VVC reference software VTM-11.0 under all-intra configuration. Compared to static networks with fixed structures, DILF demonstrates superior performance and lower computational complexity.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"35 6","pages":"6114-6128"},"PeriodicalIF":11.1000,"publicationDate":"2025-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Circuits and Systems for Video Technology","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10856241/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0
Abstract
Recently, neural network-based in-loop filters have been rapidly developed, effectively improving reconstruction quality and compression efficiency in video coding. Existing deep in-loop filters typically employ networks with fixed structures to process all image blocks. However, under various bitrate conditions, compressed image blocks with different textures exhibit varying degrees of degradation, which poses a challenge for high-quality, low-complexity filtering. Additionally, the different complexity requirements for coding tools in various scenarios limit the versatility of fixed models. To address these problems, a content-aware dynamic in-loop filter (dubbed DILF) with adjustable complexity is proposed in this paper. Specifically, DILF comprises a policy network and a filtering network. For each reconstructed image block, the policy network dynamically generates a filtering network topology based on pixel information and the quantization parameter (QP), guiding the filtering network to skip redundant layers and conduct content-aware image enhancement, thereby improving the filtering performance. In addition, by introducing a user-defined balancing factor into the policy network, the content-aware filtering network topology can be further adjusted according to the user's requirements, enabling adjustable complexity with a single model. We integrate DILF into Versatile Video Coding (VVC) to replace the built-in deblocking filter. Extensive experiments demonstrate the efficiency of DILF in processing image blocks with varying degrees of degradation and its flexibility in controlling complexity. When the balancing factor is set to 2e-5, DILF achieves average bitrate savings of 8.07%, 17.97%, and 20.93% for the YUV components over the VVC reference software VTM-11.0 under the all-intra configuration. Compared to static networks with fixed structures, DILF demonstrates superior performance and lower computational complexity.
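The mechanism the abstract describes, a policy network that conditions per-layer skip decisions on block content and QP, with a balancing factor trading quality against complexity, can be illustrated with a minimal sketch. This is not the authors' DILF implementation: the class names, layer counts, QP normalization, and gating scheme below are assumptions made for illustration, and the training loss in which the balancing factor would weight a complexity penalty is omitted.

```python
# Minimal PyTorch sketch of content- and QP-aware dynamic layer skipping.
# All names (PolicyNet, FilterNet, balance) are illustrative, not from the paper.
import torch
import torch.nn as nn


class PolicyNet(nn.Module):
    """Predicts a per-layer keep/skip decision from a block and its QP."""

    def __init__(self, num_layers: int, balance: float = 2e-5):
        super().__init__()
        # In training, `balance` would weight a complexity (e.g., FLOPs) penalty
        # against the distortion loss; training is omitted from this sketch.
        self.balance = balance
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, num_layers)

    def forward(self, block: torch.Tensor, qp: torch.Tensor) -> torch.Tensor:
        # Concatenate the block with a constant QP plane so the policy is
        # conditioned on both content and quantization strength.
        qp_plane = qp.view(-1, 1, 1, 1).expand(-1, 1, *block.shape[-2:])
        logits = self.head(self.features(torch.cat([block, qp_plane], dim=1)))
        # Hard decisions at inference; training would typically use a
        # differentiable relaxation (e.g., Gumbel-softmax), omitted here.
        return (torch.sigmoid(logits) > 0.5).float()


class FilterNet(nn.Module):
    """Residual filtering network whose layers can be skipped per block."""

    def __init__(self, num_layers: int = 8, channels: int = 32):
        super().__init__()
        self.head = nn.Conv2d(1, channels, 3, padding=1)
        self.layers = nn.ModuleList([
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
            for _ in range(num_layers)
        ])
        self.tail = nn.Conv2d(channels, 1, 3, padding=1)

    def forward(self, block: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        x = self.head(block)
        for i, layer in enumerate(self.layers):
            # Skip the layer when the policy marks it redundant for this block.
            if mask[0, i] > 0.5:
                x = x + layer(x)
        return block + self.tail(x)  # enhanced reconstruction (residual form)


if __name__ == "__main__":
    policy, filt = PolicyNet(num_layers=8), FilterNet(num_layers=8)
    block = torch.rand(1, 1, 64, 64)     # reconstructed luma block in [0, 1]
    qp = torch.tensor([32.0]) / 63.0     # QP normalized to the VVC range
    mask = policy(block, qp)             # content- and QP-dependent topology
    enhanced = filt(block, mask)
    print(mask, enhanced.shape)
```

In this reading, lowering the balancing factor during training would make skipping layers cheaper relative to distortion, so a single trained model could be steered toward lighter or heavier topologies, which is how the abstract's single-model complexity adjustment is understood here.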
Journal Description:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.