Zhijian Hao;Heming Sun;Guohao Xu;Jiaming Liu;Xiankui Xiong;Xuanpeng Zhu;Xiaoyang Zeng;Yibo Fan
Fast Transform Kernel Selection Based on Frequency Matching and Probability Model for AV1
IEEE Transactions on Broadcasting, vol. 70, no. 2, pp. 693-707. Published 2024-03-26.
DOI: 10.1109/TBC.2024.3374078
URL: https://ieeexplore.ieee.org/document/10479536/
Impact factor: 3.2 (JCR Q2, Engineering, Electrical & Electronic)
Citations: 0
Abstract
As a fundamental component of video coding, transform coding concentrates the energy scattered in the spatial domain onto the upper-left region of the frequency domain. Combined with quantization and entropy coding, this concentration contributes significantly to Rate-Distortion performance. To better adapt to the dynamic characteristics of image content, Alliance for Open Media Video 1 (AV1) introduces multiple transform kernels, which bring substantial coding performance benefits, albeit at the cost of considerable computational complexity. In this paper, we propose a fast transform kernel selection algorithm for AV1, based on frequency matching and a probability model, that accelerates the coding process with an acceptable level of performance loss. First, we define the Frequency Matching Factor (FMF), a cosine-similarity measure of how closely the residual block resembles the primary frequency basis image of a transform kernel. Statistical results demonstrate a clear distribution relationship between FMFs and normalized Rate-Distortion optimization costs (nRDOC). Then, leveraging these distribution characteristics, we establish a Gaussian (normal) probability model of nRDOC for each FMF by characterizing the parameters of the normal model as functions of the FMF, which improves both the model's accuracy and the coding performance. Finally, based on the derived normal models, we design a scalable, hardware-friendly fast selection algorithm that skips non-promising transform kernels. Experimental results show that the performance loss of the proposed fast algorithm is 1.15% when 57.66% of the transform kernels are skipped, saving 20.09% of encoding time. This is superior to other fast algorithms found in the literature and competitive with the neural-network-based pruning algorithm in the AV1 reference software.
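The pipeline the abstract describes (cosine-similarity FMF, a normal model of nRDOC whose parameters depend on the FMF, then a skip decision) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the model's functional forms, so the linear mean `a * |FMF| + b`, the constant standard deviation `s`, the thresholds `cost_threshold` and `min_prob`, and all function names here are hypothetical placeholders chosen only to make the idea concrete.

```python
import math


def cosine_similarity(block_a, block_b):
    """Cosine similarity between two equally sized 2-D blocks (lists of rows)."""
    flat_a = [x for row in block_a for x in row]
    flat_b = [x for row in block_b for x in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(x * x for x in flat_a))
    norm_b = math.sqrt(sum(y * y for y in flat_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero block matches nothing
    return dot / (norm_a * norm_b)


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def should_skip(fmf, cost_threshold=1.1, min_prob=0.1, a=-0.5, b=1.5, s=0.2):
    """Decide whether to skip a transform kernel given its FMF.

    ASSUMPTION: the mean of nRDOC falls linearly as |FMF| rises and the
    standard deviation is constant. The paper's fitted functions are not
    stated in the abstract; these parameters are illustrative only.
    """
    mean = a * abs(fmf) + b
    # Probability that this kernel's normalized cost is competitive.
    p_competitive = normal_cdf(cost_threshold, mean, s)
    return p_competitive < min_prob


# Hypothetical 2x2 residual and two toy "basis images".
residual = [[4.0, -4.0], [4.0, -4.0]]
horizontal_basis = [[1.0, -1.0], [1.0, -1.0]]
vertical_basis = [[1.0, 1.0], [-1.0, -1.0]]

for name, basis in (("horizontal", horizontal_basis), ("vertical", vertical_basis)):
    fmf = cosine_similarity(residual, basis)
    print(name, round(fmf, 3), "skip" if should_skip(fmf) else "evaluate")
```

With these placeholder parameters, a kernel whose basis image aligns with the residual is kept for full Rate-Distortion evaluation, while one orthogonal to it is skipped. Raising `min_prob` would skip more kernels at a higher risk of coding loss, which is one plausible reading of the scalability the abstract mentions.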
About the journal:
The Society’s Field of Interest is “Devices, equipment, techniques and systems related to broadcast technology, including the production, distribution, transmission, and propagation aspects.” In addition to this formal FOI statement, which is used to provide guidance to the Publications Committee in the selection of content, the AdCom has further resolved that “broadcast systems includes all aspects of transmission, propagation, and reception.”