Recurrent Neural Network-Based Video Compression

Zahra Montajabi, V. Ghassab, N. Bouguila
Published in: 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)
Publication date: 2022-12-01
DOI: 10.1109/ICMLA55696.2022.00154 (https://doi.org/10.1109/ICMLA55696.2022.00154)
Citations: 0

Abstract

Video compression has recently attracted considerable attention among computer vision problems in media technologies. With state-of-the-art compression methods, videos can be transmitted at higher quality while requiring less bandwidth and memory. The advent of neural network-based methods has markedly improved video coding performance. This paper presents a video compression method based on a recurrent neural network (RNN). The method comprises an encoder, a middle module, and a decoder. A binarizer is used in the middle module to achieve better quantization performance. In the encoder and decoder modules, long short-term memory (LSTM) units retain valuable information and discard unnecessary information, iteratively reducing the quality loss of the reconstructed video. The method reduces the complexity of neural network-based compression schemes and encodes videos with less quality loss. It is evaluated using the peak signal-to-noise ratio (PSNR), video multimethod assessment fusion (VMAF), and structural similarity index measure (SSIM) quality metrics. Applied to two public video compression datasets, the method outperforms existing standard video encoding schemes such as H.264 and H.265.
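The abstract names a binarizer and the PSNR metric but gives no formulas for either. The following is a minimal sketch, not the authors' implementation: it assumes a sign-based binarizer (a common choice in RNN-based compression, not confirmed by this record) and the standard PSNR definition for 8-bit pixel values. The function names `binarize` and `psnr` are hypothetical.

```python
import math
from typing import List, Sequence


def binarize(values: Sequence[float]) -> List[int]:
    """Map each real-valued code to -1 or +1 by its sign (zero maps to +1).

    Assumption: a deterministic sign binarizer; the paper may use another form.
    """
    return [1 if v >= 0 else -1 for v in values]


def psnr(reference: Sequence[float],
         reconstructed: Sequence[float],
         peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) == 0 or len(reference) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel positions.
    mse = sum((r - d) ** 2 for r, d in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals have unbounded PSNR
    return 10.0 * math.log10(peak ** 2 / mse)
```

In this sketch a frame is flattened to a sequence of pixel values before calling `psnr`; a higher result means the reconstruction is closer to the original. VMAF and SSIM are more involved and are usually computed with dedicated tools rather than a few lines of code.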