Early quadtree with nested multitype tree partitioning algorithm based on convolution neural network for the versatile video coding standard

IF 1 | Zone 4, Computer Science | Q4, Engineering, Electrical & Electronic

Journal of Electronic Imaging | Pub Date: 2024-07-01 | DOI: 10.1117/1.jei.33.4.043024

Bouthaina Abdallah, Sonda Ben Jdidia, Fatma Belghith, Mohamed Ali Ben Ayed, Nouri Masmoudi

Citations: 0

Abstract

The Joint Video Experts Team recently finalized the versatile video coding (VVC) standard, which incorporates various advanced encoding tools. These tools substantially improve coding efficiency, yielding a bitrate reduction of up to 50% compared with the previous standard, high-efficiency video coding. However, this improvement comes at the expense of high computational complexity. In this context, we address the new quadtree (QT) with nested multitype tree block partitioning in VVC for the all-intra configuration. We propose a fast intra coding unit (CU) partition algorithm that uses several convolutional neural network (CNN) classifiers to directly predict the partition mode, skip unnecessary split modes, and exit the partitioning process early. The proposed approach first predicts the QT depth of a 64×64 CU with a dedicated CNN classifier. Four CNN classifiers are then applied to predict the partition decision tree of a 32×32 CU using multiple threshold values, bypassing the rate-distortion optimization process to reduce partitioning time. The method is implemented in the reference software VTM 16.2 and tested on different video sequences. The experimental results confirm that the proposed solution reduces encoding time by about 46% on average and by up to 67.3%, with an acceptable increase in bitrate and an insignificant decrease in quality.
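
The multithreshold step mentioned in the abstract can be pictured with a small sketch. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say what each of the four classifiers outputs or which threshold values are used, so the per-mode scores, the `Thresholds` values, and the function name `prune_split_modes` are all hypothetical. The sketch shows one common two-threshold scheme: a confident score selects a split mode outright, a very low score removes the mode from the rate-distortion optimization (RDO) candidate list, and everything in between is left for the encoder to evaluate.

```python
from dataclasses import dataclass

# Split modes available to a CU under QT with nested multitype tree in VVC.
SPLIT_MODES = ("NO_SPLIT", "QT", "BT_H", "BT_V", "TT_H", "TT_V")


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical values; the abstract does not state the thresholds used.
    accept: float = 0.90  # at or above: take the mode directly, skip RDO
    reject: float = 0.10  # below: drop the mode from the RDO candidate list


def prune_split_modes(probs, th=Thresholds()):
    """Return (decided_mode, candidates) from per-mode classifier scores.

    probs maps split modes to scores in [0, 1] and stands in for the CNN
    classifier outputs (an assumption). If the top mode is confident enough,
    it is chosen outright and candidates is empty. Otherwise decided_mode is
    None and candidates lists the modes still to be evaluated with RDO.
    """
    best = max(SPLIT_MODES, key=lambda m: probs.get(m, 0.0))
    if probs.get(best, 0.0) >= th.accept:
        return best, []
    candidates = [m for m in SPLIT_MODES if probs.get(m, 0.0) >= th.reject]
    # Never leave the encoder without a candidate: fall back to the top mode.
    return None, candidates or [best]


if __name__ == "__main__":
    # Confident prediction: the partition search exits early with QT.
    print(prune_split_modes({"QT": 0.95, "NO_SPLIT": 0.05}))
    # Uncertain prediction: only the unlikely modes are skipped.
    print(prune_split_modes(
        {"NO_SPLIT": 0.5, "BT_H": 0.3, "BT_V": 0.15, "TT_H": 0.05}))
```

Widening the gap between the two thresholds keeps more modes in the RDO search, which would trade some of the time saving for a smaller bitrate penalty; narrowing it does the opposite.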

Source journal

Journal of Electronic Imaging (Engineering & Technology: Imaging Science & Photographic Technology)

CiteScore: 1.70

Self-citation rate: 27.30%

Articles published: 341

Review time: 4.0 months

Journal description: The Journal of Electronic Imaging publishes peer-reviewed papers in all technology areas that make up the field of electronic imaging and are normally considered in the design, engineering, and applications of electronic imaging systems.