Intra Picture Prediction for Video Coding with Neural Networks

Philipp Helle, Jonathan Pfaff, Michael Schäfer, R. Rischke, H. Schwarz, D. Marpe, T. Wiegand
{"title":"Intra Picture Prediction for Video Coding with Neural Networks","authors":"Philipp Helle, Jonathan Pfaff, Michael Schäfer, R. Rischke, H. Schwarz, D. Marpe, T. Wiegand","doi":"10.1109/DCC.2019.00053","DOIUrl":null,"url":null,"abstract":"We train a neural network to perform intra picture prediction for block based video coding. Our network has multiple prediction modes which co-adapt during training to minimize a loss function. By applying the l1-norm and a sigmoid-function to the prediction residual in the DCT domain, our loss function reflects properties of the residual quantization and coding stages present in the typical hybrid video coding architecture. We simplify the resulting predictors by pruning them in the frequency domain, thus greatly reducing the number of multiplications otherwise needed for the dense matrix-vector multiplications. Also, by quantizing the network weights and using fixed point arithmetic, we allow for a hardware friendly implementation. We demonstrate significant coding gains over state of the art intra prediction.","PeriodicalId":167723,"journal":{"name":"2019 Data Compression Conference (DCC)","volume":"149 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"27","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 Data Compression Conference (DCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DCC.2019.00053","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 27

Abstract

We train a neural network to perform intra picture prediction for block-based video coding. Our network has multiple prediction modes that co-adapt during training to minimize a loss function. By applying the ℓ1-norm and a sigmoid function to the prediction residual in the DCT domain, our loss function reflects properties of the residual quantization and coding stages present in the typical hybrid video coding architecture. We simplify the resulting predictors by pruning them in the frequency domain, greatly reducing the number of multiplications otherwise needed for the dense matrix-vector products. In addition, by quantizing the network weights and using fixed-point arithmetic, we allow for a hardware-friendly implementation. We demonstrate significant coding gains over state-of-the-art intra prediction.
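The abstract does not give the closed form of the training objective, so the following Python sketch only illustrates the general idea of measuring the prediction residual in the DCT domain with an ℓ1 magnitude passed through a sigmoid, which saturates the cost of large coefficients in a way that loosely mimics the rate spent on coding quantized residuals. The function name dct_domain_loss, the scale constant, and the example blocks are hypothetical and not taken from the paper.

    import numpy as np
    from scipy.fft import dctn

    def dct_domain_loss(predicted_block, original_block, scale=4.0):
        """Hypothetical sketch of a DCT-domain, sigmoid-saturated l1 residual cost."""
        residual = original_block.astype(np.float64) - predicted_block.astype(np.float64)
        coeffs = dctn(residual, norm="ortho")  # 2-D DCT-II of the prediction residual
        # Sigmoid of the scaled coefficient magnitude: small residual coefficients
        # contribute roughly linearly, large ones saturate toward a constant cost.
        # Subtracting 0.5 makes a perfect prediction (zero residual) cost zero.
        per_coeff_cost = 1.0 / (1.0 + np.exp(-scale * np.abs(coeffs))) - 0.5
        return per_coeff_cost.sum()

    # Example: cost of predicting an 8x8 block by its mean value.
    block = np.arange(64, dtype=np.float64).reshape(8, 8)
    prediction = np.full((8, 8), block.mean())
    print(dct_domain_loss(prediction, block))

Because the cost is computed per DCT coefficient, such an objective naturally pairs with the paper's frequency-domain pruning: coefficients whose contribution is negligible can be dropped from the predictor's output without disturbing the training target.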