Low complexity encoder for feedback-channel-free distributed video coding using deep convolutional neural networks at the decoder

Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing | Pub Date: 2016-12-18 | DOI: 10.1145/3009977.3009986

Pudi Raj Bhagath, J. Mukherjee, Sudipta Mukopadhayay

Citations: 3

Abstract

We propose a very low complexity encoder for feedback-channel-free distributed video coding (DVC) applications that uses a deep convolutional neural network (CNN) at the decoder side. Deep CNNs for super resolution take low resolution (LR) images carrying 25% of the pixel information of the high resolution (HR) image and super-resolve them by a factor of 2. Instead, we train the network with 50% noisy Wyner-Ziv (WZ) pixels to recover the full original WZ frame. At the decoder, the deep CNN therefore reconstructs the original WZ image from 50% noisy WZ pixels. These noisy samples are obtained from an iterative algorithm called DLRTex. At the encoder side, we compute the local rank transform (LRT) of WZ frames for alternate pixels rather than for all pixels, which reduces bit rate and complexity. These local rank transformed values are merged, and their rank positions in the WZ frame are entropy coded using an MQ-coder. In addition, the average intensity values of each block of the WZ frame are transmitted to assist motion estimation. At the decoder, side information (SI) is generated by performing motion estimation and compensation in the LRT domain. The DLRTex algorithm is executed on the SI using the LRT to obtain the 50% noisy WZ pixels, which are then used to reconstruct the full WZ frame. We compare our results with pixel domain DVC approaches and show that the coding efficiency of our codec is better than that of pixel domain distributed video coders based on low-density parity check and accumulate (LDPCA) or turbo codes. We also derive the complexity of our encoder in terms of the number of operations and prove that it is far lower than that of the LDPCA-based methods.
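
The encoder-side steps described in the abstract (rank transform on alternate pixels plus per-block average intensities) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the LRT window, the rank convention, the sampling pattern, or the block size, so the 3x3 window, the "count of strictly darker neighbours" rank, the checkerboard pattern, and the names `local_rank`, `lrt_alternate_pixels`, and `block_means` are all assumptions made for illustration. Merging of ranks and MQ entropy coding are omitted.

```python
def local_rank(frame, r, c, radius=1):
    """Assumed rank definition: number of neighbours in the
    (2*radius+1)^2 window that are strictly darker than pixel (r, c)."""
    h, w = len(frame), len(frame[0])
    center = frame[r][c]
    rank = 0
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < h and 0 <= cc < w
            # Skip the centre pixel itself and anything outside the frame.
            if (dr or dc) and inside and frame[rr][cc] < center:
                rank += 1
    return rank


def lrt_alternate_pixels(frame, radius=1):
    """Compute ranks on a checkerboard only (assumed meaning of
    'alternate pixels'); skipped positions are marked None."""
    return [
        [local_rank(frame, r, c, radius) if (r + c) % 2 == 0 else None
         for c in range(len(frame[0]))]
        for r in range(len(frame))
    ]


def block_means(frame, block=2):
    """Average intensity of each non-overlapping block x block tile
    (block size is an assumption; the abstract does not state it)."""
    h, w = len(frame), len(frame[0])
    means = []
    for r0 in range(0, h, block):
        row = []
        for c0 in range(0, w, block):
            tile = [frame[r][c]
                    for r in range(r0, min(r0 + block, h))
                    for c in range(c0, min(c0 + block, w))]
            row.append(sum(tile) / len(tile))
        means.append(row)
    return means


# Toy 4x4 "WZ frame" with increasing intensities.
frame = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
ranks = lrt_alternate_pixels(frame)
means = block_means(frame, block=2)
```

In this sketch only half of the positions carry a rank value, which mirrors the abstract's point that the encoder touches 50% of the WZ pixels and leaves the recovery of the full frame to the decoder.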
