Recurrent Neural Network-Based Video Compression

2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA). Pub Date: 2022-12-01. DOI: 10.1109/ICMLA55696.2022.00154

Zahra Montajabi, V. Ghassab, N. Bouguila


Citations: 0

Abstract


Video compression has recently attracted considerable attention among computer vision problems in media technologies. State-of-the-art video compression methods allow videos to be transmitted at higher quality while requiring less bandwidth and memory. The advent of neural network-based video compression methods has markedly improved video coding performance. This paper presents a video compression method based on a recurrent neural network (RNN). The method comprises an encoder, a middle module, and a decoder. A binarizer is used in the middle module to achieve better quantization performance. In the encoder and decoder modules, long short-term memory (LSTM) units retain valuable information and discard unnecessary information, iteratively reducing the quality loss of the reconstructed video. The method reduces the complexity of neural network-based compression schemes and encodes videos with less quality loss. It is evaluated using the peak signal-to-noise ratio (PSNR), video multimethod assessment fusion (VMAF), and structural similarity index measure (SSIM) quality metrics. Applied to two public video compression datasets, the method outperforms existing standard video encoding schemes such as H.264 and H.265.
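
The abstract does not give the binarizer's exact rule or the PSNR settings, so the following is only a minimal sketch under stated assumptions: the binarizer is taken to be a deterministic sign threshold that maps each latent value to -1 or +1 (a common choice in learned compression, not confirmed as the authors' design), and PSNR is computed for 8-bit frames with a peak value of 255. The function names `binarize` and `psnr` are hypothetical.

```python
import numpy as np


def binarize(latent):
    """Map real-valued latent activations to {-1, +1}.

    Assumption: a deterministic sign threshold at zero; the paper's
    actual binarizer may differ (for example, a stochastic variant).
    """
    latent = np.asarray(latent, dtype=np.float64)
    return np.where(latent >= 0.0, 1, -1).astype(np.int8)


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two frames of equal shape.

    Assumption: max_value=255 corresponds to 8-bit pixel data.
    """
    reference = np.asarray(reference, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((reference - reconstructed) ** 2)
    if mse == 0.0:
        return float("inf")  # identical frames: no distortion
    return 10.0 * np.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    codes = binarize([-0.3, 0.0, 0.7])
    print(codes)

    frame = np.full((4, 4), 100.0)
    noisy = frame + 1.0  # every pixel off by one, so MSE = 1
    print(round(psnr(frame, noisy), 2))
```

Each binarized value carries one bit, so the number of latent values emitted per iteration determines the bit budget directly; PSNR then gives a simple per-frame check of how much quality each additional iteration recovers.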
