基于频率匹配和概率模型的 AV1 快速变换核选择

IF 3.2 1区 计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC
Zhijian Hao;Heming Sun;Guohao Xu;Jiaming Liu;Xiankui Xiong;Xuanpeng Zhu;Xiaoyang Zeng;Yibo Fan
{"title":"基于频率匹配和概率模型的 AV1 快速变换核选择","authors":"Zhijian Hao;Heming Sun;Guohao Xu;Jiaming Liu;Xiankui Xiong;Xuanpeng Zhu;Xiaoyang Zeng;Yibo Fan","doi":"10.1109/TBC.2024.3374078","DOIUrl":null,"url":null,"abstract":"As a fundamental component of video coding, transform coding concentrates the energy scattered in the spatial domain onto the upper-left region of the frequency domain. This concentration contributes significantly to Rate-Distortion performance improvement when combined with quantization and entropy coding. To better adapt the dynamic characteristics of image content, Alliance for Open Media Video 1 (AV1) introduces multiple transform kernels, which brings substantial coding performance benefits, albeit at the cost of considerably computational complexity. In this paper, we propose a fast transform kernel selection algorithm for AV1 based on frequency matching and probability model to effectively accelerate the coding process with an acceptable level of performance loss. Firstly, the concept of Frequency Matching Factor (FMF) based on cosine similarity is defined for the first time to describe the similarity between the residual block and the primary frequency basis image of the transform kernel. Statistical results demonstrate a clear distribution relationship between FMFs and normalized Rate-Distortion optimization costs (nRDOC). Then, leveraging these distribution characteristics, we establish Gaussian normal probability model of nRDOC for each FMF by characterizing the parameters of the normal model as functions of FMFs, enhancing the normal model’s accuracy and coding performance. Finally, based on the derived normal models, we design a fast selection algorithm with scalability and hardware-friendliness to skip the non-promising transform kernels. Experimental results show that the performance loss of the proposed fast algorithm is 1.15% when 57.66% of the transform kernels are skipped, resulting in a saving of 20.09% encoding time, which is superior to other fast algorithms found in the literature and competitive with the pruning algorithm based on the neural network in the AV1 reference software.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"693-707"},"PeriodicalIF":3.2000,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Fast Transform Kernel Selection Based on Frequency Matching and Probability Model for AV1\",\"authors\":\"Zhijian Hao;Heming Sun;Guohao Xu;Jiaming Liu;Xiankui Xiong;Xuanpeng Zhu;Xiaoyang Zeng;Yibo Fan\",\"doi\":\"10.1109/TBC.2024.3374078\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As a fundamental component of video coding, transform coding concentrates the energy scattered in the spatial domain onto the upper-left region of the frequency domain. This concentration contributes significantly to Rate-Distortion performance improvement when combined with quantization and entropy coding. To better adapt the dynamic characteristics of image content, Alliance for Open Media Video 1 (AV1) introduces multiple transform kernels, which brings substantial coding performance benefits, albeit at the cost of considerably computational complexity. In this paper, we propose a fast transform kernel selection algorithm for AV1 based on frequency matching and probability model to effectively accelerate the coding process with an acceptable level of performance loss. Firstly, the concept of Frequency Matching Factor (FMF) based on cosine similarity is defined for the first time to describe the similarity between the residual block and the primary frequency basis image of the transform kernel. Statistical results demonstrate a clear distribution relationship between FMFs and normalized Rate-Distortion optimization costs (nRDOC). Then, leveraging these distribution characteristics, we establish Gaussian normal probability model of nRDOC for each FMF by characterizing the parameters of the normal model as functions of FMFs, enhancing the normal model’s accuracy and coding performance. Finally, based on the derived normal models, we design a fast selection algorithm with scalability and hardware-friendliness to skip the non-promising transform kernels. Experimental results show that the performance loss of the proposed fast algorithm is 1.15% when 57.66% of the transform kernels are skipped, resulting in a saving of 20.09% encoding time, which is superior to other fast algorithms found in the literature and competitive with the pruning algorithm based on the neural network in the AV1 reference software.\",\"PeriodicalId\":13159,\"journal\":{\"name\":\"IEEE Transactions on Broadcasting\",\"volume\":\"70 2\",\"pages\":\"693-707\"},\"PeriodicalIF\":3.2000,\"publicationDate\":\"2024-03-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Broadcasting\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10479536/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Broadcasting","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10479536/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

摘要

作为视频编码的基本组成部分,变换编码将空间域中分散的能量集中到频域的左上方区域。当与量化和熵编码相结合时,这种集中能极大地改善速率-失真性能。为了更好地适应图像内容的动态特性,开放媒体视频联盟 1(AV1)引入了多重变换内核,这带来了巨大的编码性能优势,尽管代价是相当高的计算复杂度。本文提出了一种基于频率匹配和概率模型的 AV1 快速变换内核选择算法,以在可接受的性能损失水平上有效加速编码过程。首先,本文首次定义了基于余弦相似性的频率匹配系数(FMF)概念,用以描述残差块与变换核的主频基图像之间的相似性。统计结果表明,FMF 与归一化率失真优化成本 (nRDOC) 之间存在明显的分布关系。然后,我们利用这些分布特征,通过将正态模型的参数表征为 FMF 的函数,为每个 FMF 建立了 nRDOC 的高斯正态概率模型,从而提高了正态模型的准确性和编码性能。最后,基于推导出的正则模型,我们设计了一种具有可扩展性和硬件友好性的快速选择算法,以跳过不具潜力的变换内核。实验结果表明,当跳过 57.66% 的变换核时,所提出的快速算法的性能损失为 1.15%,从而节省了 20.09% 的编码时间,优于文献中发现的其他快速算法,与 AV1 参考软件中基于神经网络的剪枝算法相比也具有竞争力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Fast Transform Kernel Selection Based on Frequency Matching and Probability Model for AV1
As a fundamental component of video coding, transform coding concentrates the energy scattered in the spatial domain onto the upper-left region of the frequency domain. This concentration contributes significantly to Rate-Distortion performance improvement when combined with quantization and entropy coding. To better adapt the dynamic characteristics of image content, Alliance for Open Media Video 1 (AV1) introduces multiple transform kernels, which brings substantial coding performance benefits, albeit at the cost of considerably computational complexity. In this paper, we propose a fast transform kernel selection algorithm for AV1 based on frequency matching and probability model to effectively accelerate the coding process with an acceptable level of performance loss. Firstly, the concept of Frequency Matching Factor (FMF) based on cosine similarity is defined for the first time to describe the similarity between the residual block and the primary frequency basis image of the transform kernel. Statistical results demonstrate a clear distribution relationship between FMFs and normalized Rate-Distortion optimization costs (nRDOC). Then, leveraging these distribution characteristics, we establish Gaussian normal probability model of nRDOC for each FMF by characterizing the parameters of the normal model as functions of FMFs, enhancing the normal model’s accuracy and coding performance. Finally, based on the derived normal models, we design a fast selection algorithm with scalability and hardware-friendliness to skip the non-promising transform kernels. Experimental results show that the performance loss of the proposed fast algorithm is 1.15% when 57.66% of the transform kernels are skipped, resulting in a saving of 20.09% encoding time, which is superior to other fast algorithms found in the literature and competitive with the pruning algorithm based on the neural network in the AV1 reference software.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
IEEE Transactions on Broadcasting
IEEE Transactions on Broadcasting 工程技术-电信学
CiteScore
9.40
自引率
31.10%
发文量
79
审稿时长
6-12 weeks
期刊介绍: The Society’s Field of Interest is “Devices, equipment, techniques and systems related to broadcast technology, including the production, distribution, transmission, and propagation aspects.” In addition to this formal FOI statement, which is used to provide guidance to the Publications Committee in the selection of content, the AdCom has further resolved that “broadcast systems includes all aspects of transmission, propagation, and reception.”
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信