基于ssim的HEVC感知视频编码

A. Rehman, Zhou Wang
{"title":"基于ssim的HEVC感知视频编码","authors":"A. Rehman, Zhou Wang","doi":"10.1109/ICME.2012.175","DOIUrl":null,"url":null,"abstract":"Recent advances in video capturing and display technologies, along with the exponentially increasing demand of video services, challenge the video coding research community to design new algorithms able to significantly improve the compression performance of the current H.264/AVC standard. This target is currently gaining evidence with the standardization activities in the High Efficiency Video Coding (HEVC) project. The distortion models used in HEVC are mean squared error (MSE) and sum of absolute difference (SAD). However, they are widely criticized for not correlating well with perceptual image quality. The structural similarity (SSIM) index has been found to be a good indicator of perceived image quality. Meanwhile, it is computationally simple compared with other state-of-the-art perceptual quality measures and has a number of desirable mathematical properties for optimization tasks. We propose a perceptual video coding method to improve upon the current HEVC based on an SSIM-inspired divisive normalization scheme as an attempt to transform the DCT domain frame prediction residuals to a perceptually uniform space before encoding. Based on the residual divisive normalization process, we define a distortion model for mode selection and show that such a divisive normalization strategy largely simplifies the subsequent perceptual rate-distortion optimization procedure. We further adjust the divisive normalization factors based on local content of the video frame. Experiments show that the proposed scheme can achieve significant gain in terms of rate-SSIM performance when compared with HEVC.","PeriodicalId":273567,"journal":{"name":"2012 IEEE International Conference on Multimedia and Expo","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"42","resultStr":"{\"title\":\"SSIM-Inspired Perceptual Video Coding for HEVC\",\"authors\":\"A. Rehman, Zhou Wang\",\"doi\":\"10.1109/ICME.2012.175\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recent advances in video capturing and display technologies, along with the exponentially increasing demand of video services, challenge the video coding research community to design new algorithms able to significantly improve the compression performance of the current H.264/AVC standard. This target is currently gaining evidence with the standardization activities in the High Efficiency Video Coding (HEVC) project. The distortion models used in HEVC are mean squared error (MSE) and sum of absolute difference (SAD). However, they are widely criticized for not correlating well with perceptual image quality. The structural similarity (SSIM) index has been found to be a good indicator of perceived image quality. Meanwhile, it is computationally simple compared with other state-of-the-art perceptual quality measures and has a number of desirable mathematical properties for optimization tasks. We propose a perceptual video coding method to improve upon the current HEVC based on an SSIM-inspired divisive normalization scheme as an attempt to transform the DCT domain frame prediction residuals to a perceptually uniform space before encoding. Based on the residual divisive normalization process, we define a distortion model for mode selection and show that such a divisive normalization strategy largely simplifies the subsequent perceptual rate-distortion optimization procedure. We further adjust the divisive normalization factors based on local content of the video frame. Experiments show that the proposed scheme can achieve significant gain in terms of rate-SSIM performance when compared with HEVC.\",\"PeriodicalId\":273567,\"journal\":{\"name\":\"2012 IEEE International Conference on Multimedia and Expo\",\"volume\":\"14 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-07-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"42\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2012 IEEE International Conference on Multimedia and Expo\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICME.2012.175\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 IEEE International Conference on Multimedia and Expo","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICME.2012.175","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 42

摘要

视频采集和显示技术的最新进展,以及视频服务需求的指数级增长,对视频编码研究社区提出了挑战,要求他们设计出能够显著提高当前H.264/AVC标准压缩性能的新算法。这一目标目前正在高效视频编码(HEVC)项目的标准化活动中获得证据。HEVC中使用的失真模型是均方误差(MSE)和绝对差和(SAD)。然而,它们因不能很好地与感知图像质量相关而受到广泛批评。结构相似性(SSIM)指数已被发现是感知图像质量的一个很好的指标。同时,与其他最先进的感知质量度量相比,它计算简单,并且具有许多理想的数学性质,用于优化任务。我们提出了一种基于ssim启发的分裂归一化方案改进当前HEVC的感知视频编码方法,试图在编码前将DCT域帧预测残差转换为感知均匀空间。基于残差分裂归一化过程,定义了一种用于模式选择的失真模型,并表明这种分裂归一化策略在很大程度上简化了随后的感知率失真优化过程。我们根据视频帧的局部内容进一步调整分裂归一化因子。实验表明,与HEVC相比,该方案在速率- ssim性能上有显著提高。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
SSIM-Inspired Perceptual Video Coding for HEVC
Recent advances in video capturing and display technologies, along with the exponentially increasing demand of video services, challenge the video coding research community to design new algorithms able to significantly improve the compression performance of the current H.264/AVC standard. This target is currently gaining evidence with the standardization activities in the High Efficiency Video Coding (HEVC) project. The distortion models used in HEVC are mean squared error (MSE) and sum of absolute difference (SAD). However, they are widely criticized for not correlating well with perceptual image quality. The structural similarity (SSIM) index has been found to be a good indicator of perceived image quality. Meanwhile, it is computationally simple compared with other state-of-the-art perceptual quality measures and has a number of desirable mathematical properties for optimization tasks. We propose a perceptual video coding method to improve upon the current HEVC based on an SSIM-inspired divisive normalization scheme as an attempt to transform the DCT domain frame prediction residuals to a perceptually uniform space before encoding. Based on the residual divisive normalization process, we define a distortion model for mode selection and show that such a divisive normalization strategy largely simplifies the subsequent perceptual rate-distortion optimization procedure. We further adjust the divisive normalization factors based on local content of the video frame. Experiments show that the proposed scheme can achieve significant gain in terms of rate-SSIM performance when compared with HEVC.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信