Junyan Huo;Yanzhuo Ma;Zhenyao Zhang;Hongli Zhang;Hui Yuan;Shuai Wan;Fuzheng Yang
{"title":"基于自适应增强全局内预测的超VVC高效视频编码","authors":"Junyan Huo;Yanzhuo Ma;Zhenyao Zhang;Hongli Zhang;Hui Yuan;Shuai Wan;Fuzheng Yang","doi":"10.1109/TCSVT.2025.3535951","DOIUrl":null,"url":null,"abstract":"Global intra prediction (GIP), including intra-block copy and template matching prediction (TMP), exploits the global correlation of the same image to improve the coding efficiency. In Beyond VVC, TMP uses template matching to determine the reference blocks for efficient prediction. There usually exists an error between the coding block and reference blocks, caused by the content mismatch or the coding distortion of the reference blocks. We propose an enhancement over the reference blocks, namely enhanced GIP (EGIP). Specifically, we design an enhanced filter according to the templates of the coding block and the reference blocks, with the reconstructed template of the coding block as the label for supervised learning. To support different enhancements, we design two types of inputs, i.e., EGIP based on neighboring samples (N-EGIP) and EGIP based on multiple hypothesis references (M-EGIP). Experimental results show that, based on enhanced compression model (ECM) version 8.0, N-EGIP achieves BD-rate reductions of 0.37%, 0.42%, and 0.40%, and M-EGIP brings 0.34%, 0.37%, and 0.34% BD-rate savings for Y, Cb, and Cr components, respectively. A higher coding gain, 0.46%, 0.54%, and 0.52% BD-rate savings, can be achieved by integrating N-EGIP and M-EGIP together. Owing to the coding gain and small complexity increase, the proposed EGIP has been adopted in the exploration of Beyond VVC and integrated into its reference software.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"35 6","pages":"6145-6157"},"PeriodicalIF":11.1000,"publicationDate":"2025-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Adaptive Enhanced Global Intra Prediction for Efficient Video Coding in Beyond VVC\",\"authors\":\"Junyan Huo;Yanzhuo Ma;Zhenyao Zhang;Hongli Zhang;Hui Yuan;Shuai Wan;Fuzheng Yang\",\"doi\":\"10.1109/TCSVT.2025.3535951\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Global intra prediction (GIP), including intra-block copy and template matching prediction (TMP), exploits the global correlation of the same image to improve the coding efficiency. In Beyond VVC, TMP uses template matching to determine the reference blocks for efficient prediction. There usually exists an error between the coding block and reference blocks, caused by the content mismatch or the coding distortion of the reference blocks. We propose an enhancement over the reference blocks, namely enhanced GIP (EGIP). Specifically, we design an enhanced filter according to the templates of the coding block and the reference blocks, with the reconstructed template of the coding block as the label for supervised learning. To support different enhancements, we design two types of inputs, i.e., EGIP based on neighboring samples (N-EGIP) and EGIP based on multiple hypothesis references (M-EGIP). Experimental results show that, based on enhanced compression model (ECM) version 8.0, N-EGIP achieves BD-rate reductions of 0.37%, 0.42%, and 0.40%, and M-EGIP brings 0.34%, 0.37%, and 0.34% BD-rate savings for Y, Cb, and Cr components, respectively. A higher coding gain, 0.46%, 0.54%, and 0.52% BD-rate savings, can be achieved by integrating N-EGIP and M-EGIP together. 
Owing to the coding gain and small complexity increase, the proposed EGIP has been adopted in the exploration of Beyond VVC and integrated into its reference software.\",\"PeriodicalId\":13082,\"journal\":{\"name\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"volume\":\"35 6\",\"pages\":\"6145-6157\"},\"PeriodicalIF\":11.1000,\"publicationDate\":\"2025-01-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10857454/\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Circuits and Systems for Video Technology","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10857454/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Adaptive Enhanced Global Intra Prediction for Efficient Video Coding in Beyond VVC
Global intra prediction (GIP), including intra-block copy and template matching prediction (TMP), exploits global correlation within an image to improve coding efficiency. In Beyond VVC, TMP uses template matching to determine the reference blocks for efficient prediction. An error usually exists between the coding block and its reference blocks, caused by content mismatch or by coding distortion of the reference blocks. We propose an enhancement applied to the reference blocks, namely enhanced GIP (EGIP). Specifically, we design an enhancement filter according to the templates of the coding block and the reference blocks, with the reconstructed template of the coding block serving as the label for supervised learning. To support different enhancements, we design two types of inputs: EGIP based on neighboring samples (N-EGIP) and EGIP based on multiple hypothesis references (M-EGIP). Experimental results show that, on top of the enhanced compression model (ECM) version 8.0, N-EGIP achieves BD-rate reductions of 0.37%, 0.42%, and 0.40%, and M-EGIP brings BD-rate savings of 0.34%, 0.37%, and 0.34% for the Y, Cb, and Cr components, respectively. A higher coding gain, with BD-rate savings of 0.46%, 0.54%, and 0.52%, is achieved by integrating N-EGIP and M-EGIP. Owing to the coding gain and the small complexity increase, the proposed EGIP has been adopted in the exploration of Beyond VVC and integrated into its reference software.
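The abstract states that the enhancement filter is derived from the templates of the coding block and the reference blocks, with the reconstructed template of the coding block as the supervision label, but it does not specify the filter structure. The sketch below is only a minimal illustration of one common template-driven approach, a least-squares (Wiener-style) fit in Python with NumPy; the function names, feature layout, and dimensions are illustrative assumptions and are not the actual EGIP design or the ECM implementation.

```python
import numpy as np

def derive_template_filter(ref_template, cur_template):
    """Fit linear filter coefficients (plus a DC offset) that map the
    reference-side template samples to the reconstructed template of the
    coding block, which serves as the supervision label.

    ref_template : (N, K) array, K input features per template sample
                   (e.g., a co-located reference sample and its neighbors,
                   or several hypothesis references) -- hypothetical layout.
    cur_template : (N,) array, reconstructed template of the coding block.
    """
    # Append a constant column so the least-squares fit includes an offset term.
    A = np.hstack([ref_template, np.ones((ref_template.shape[0], 1))])
    coeffs, *_ = np.linalg.lstsq(A, cur_template, rcond=None)
    return coeffs

def enhance_reference(ref_block_features, coeffs):
    """Apply the derived filter to the reference-block samples to obtain an
    enhanced prediction block."""
    A = np.hstack([ref_block_features,
                   np.ones((ref_block_features.shape[0], 1))])
    return A @ coeffs

# Toy usage: 64 template samples, 5 input features each (made-up numbers).
rng = np.random.default_rng(0)
ref_template = rng.uniform(0, 255, (64, 5))
cur_template = ref_template @ np.array([0.5, 0.2, 0.1, 0.1, 0.1]) + 3.0
coeffs = derive_template_filter(ref_template, cur_template)
enhanced = enhance_reference(rng.uniform(0, 255, (16, 5)), coeffs)
```

Under this reading, N-EGIP and M-EGIP would differ only in how the input feature columns are populated (neighboring samples versus multiple hypothesis references); again, this is an assumption for illustration, not a statement of the paper's method.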
Journal introduction:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.