A Multi-Pass Coding Mode Search Framework for AV1 Encoder Optimization

2019 Data Compression Conference (DCC), Pub Date: 2019-03-26, DOI: 10.1109/DCC.2019.00054

Ching-Han Chiang, Jingning Han, Yaowu Xu


Citations: 5

Abstract



The AV1 codec, recently released by the Alliance for Open Media, provides a nearly 30% BD-rate reduction over its predecessor, VP9. It substantially extends the available coding block sizes and supports a wide range of prediction modes, along with a large variety of transform kernel types and sizes. Together, these provide an extremely wide range of flexible coding options. To translate this flexibility into compression efficiency, the encoder must conduct an extensive search over the space of coding modes. Optimizing the trade-off between encoder complexity and compression efficiency is therefore critical to productionizing AV1. Many research efforts have been devoted to devising feature-space-based pruning methods, ranging from decision rules based on simple observations to more complex neural network models. This work proposes a multi-pass coding mode search framework that provides a structural approach to reducing the search volume. It decomposes the original high-dimensional search into cascaded stages of lower-dimensional searches. To retain a near-optimal search result, the scheme departs from the conventional dimension reduction approach, in which a single winner is retained at each stage and used for the next stage (dimension). Instead, the framework retains at each stage a subset of the states that are the most likely winners, which is then fed into the next stage to find the next subset of winners. The subset size at each stage is determined by the likelihood that the optimal route is captured in the current stage. Changing this likelihood parameter tunes the encoder's trade-off between speed and compression performance. The framework can be integrated with most existing feature-based methods at its various stages. In the libaom AV1 encoder, it provides a 60% encoding time reduction at the expense of a 0.6% compression loss.
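The staged search described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation and does not use any libaom API. The cost table, the two dimensions (prediction mode, then transform type), the names `likely_winners` and `multi_pass_search`, and the softmax used to turn stage costs into a "probability of winning" are all assumptions made for illustration, since the abstract does not specify how the likelihood is estimated.

```python
from math import exp

def likely_winners(costs, likelihood):
    """Keep the lowest-cost candidates until their estimated chance of
    containing the true winner reaches `likelihood`.

    costs: dict mapping candidate -> stage cost (lower is better).
    """
    best = min(costs.values())
    # Assumption: a softmax over cost differences stands in for the
    # (unspecified) model of how likely each candidate is to win.
    weights = {c: exp(-(cost - best)) for c, cost in costs.items()}
    total = sum(weights.values())
    kept, mass = [], 0.0
    for cand in sorted(costs, key=costs.get):
        kept.append(cand)
        mass += weights[cand] / total
        if mass >= likelihood:
            break
    return kept

def multi_pass_search(full_cost, modes, transforms, default_tx, likelihood):
    # Stage 1: search the mode dimension with the transform fixed to a default.
    stage1 = {m: full_cost[(m, default_tx)] for m in modes}
    survivors = likely_winners(stage1, likelihood)
    # Stage 2: search the transform dimension only for the surviving modes.
    stage2 = {(m, t): full_cost[(m, t)] for m in survivors for t in transforms}
    return survivors, min(stage2, key=stage2.get)

# Hypothetical rate-distortion costs; the labels are toy names only.
MODES = ["DC", "V", "H", "SMOOTH"]
TRANSFORMS = ["DCT", "ADST", "IDTX"]
FULL_COST = {
    ("DC", "DCT"): 10.0, ("DC", "ADST"): 9.5, ("DC", "IDTX"): 12.0,
    ("V", "DCT"): 10.4, ("V", "ADST"): 8.0, ("V", "IDTX"): 11.0,
    ("H", "DCT"): 14.0, ("H", "ADST"): 13.5, ("H", "IDTX"): 15.0,
    ("SMOOTH", "DCT"): 18.0, ("SMOOTH", "ADST"): 17.0, ("SMOOTH", "IDTX"): 19.0,
}

# A low likelihood keeps a single winner (the conventional approach).
print(multi_pass_search(FULL_COST, MODES, TRANSFORMS, "DCT", 0.5))
# -> (['DC'], ('DC', 'ADST'))

# A higher likelihood keeps a subset and recovers the true optimum.
print(multi_pass_search(FULL_COST, MODES, TRANSFORMS, "DCT", 0.9))
# -> (['DC', 'V'], ('V', 'ADST'))
```

In this toy table the single-winner search commits to "DC" after stage 1 and ends at cost 9.5, while retaining two likely winners finds ("V", "ADST") at cost 8.0 without evaluating the transforms for the two unlikely modes. Raising `likelihood` toward 1 approaches an exhaustive search, which mirrors the speed versus compression trade-off the abstract attributes to the likelihood parameter.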

