Machine-learning-based saliency detection and its video decoding application in wireless multimedia communications

Applications of Machine Learning in Wireless Communications. Pub Date: 2019-06-19. DOI: 10.1049/PBTE081E_CH9

Mai Xu, Lai Jiang, Zhiguo Ding


Citations: 0

Abstract



Saliency detection has been widely studied as a way to predict human fixations, with various applications in wireless multimedia communications. For saliency detection, we argue that the state-of-the-art High Efficiency Video Coding (HEVC) standard can be used to generate useful features in the compressed domain. This chapter therefore proposes learning a video-saliency model from HEVC features. First, we establish an eye-tracking database for video-saliency detection. Statistical analysis of this database shows that human fixations tend to fall into regions with large-valued HEVC features for splitting depth, bit allocation, and motion vector (MV). Three further observations are obtained from additional analysis of the eye-tracking database. Accordingly, several HEVC-domain features are proposed on the basis of splitting depth, bit allocation, and MV. Next, a support vector machine (SVM) is learned to integrate those HEVC features for video-saliency detection. Since almost all video data are stored in compressed form, our method avoids both the computational cost of decoding and the storage cost of raw data. More importantly, experimental results show that the proposed method is superior to other state-of-the-art saliency-detection methods, in both the compressed and uncompressed domains.
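The idea of integrating HEVC features with an SVM can be illustrated with a minimal sketch. It is not the chapter's implementation: the abstract does not specify the kernel, the exact feature definitions, or the training procedure. The sketch assumes a linear SVM trained by per-sample subgradient descent on the regularised hinge loss, and the three normalised per-block features, the toy data, and all function names and hyperparameters are hypothetical.

```python
# Illustrative only: features, data and hyperparameters are hypothetical,
# not taken from the chapter.

def decision_value(w, b, x):
    """Linear score w.x + b; a larger value means 'more likely fixated'."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(samples, labels, lam=0.001, lr=0.1, epochs=200):
    """Per-sample subgradient descent on lam/2 * ||w||^2 + hinge loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            violated = y * decision_value(w, b, x) < 1
            # Weight decay contributed by the L2 regulariser.
            w = [wi * (1 - lr * lam) for wi in w]
            if violated:
                # Hinge-loss subgradient step for a margin violation.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


# Each row: (splitting depth, bit allocation, MV magnitude), scaled to [0, 1].
# Synthetic blocks: fixated ones carry large feature values, as the abstract
# reports for real fixations.
fixated = [(1.0, 0.9, 0.8), (0.67, 0.8, 0.9), (1.0, 0.7, 0.6), (0.67, 0.9, 0.7)]
non_fixated = [(0.0, 0.1, 0.0), (0.33, 0.2, 0.1), (0.0, 0.0, 0.2), (0.33, 0.1, 0.1)]
samples = fixated + non_fixated
labels = [1] * len(fixated) + [-1] * len(non_fixated)

w, b = train_linear_svm(samples, labels)
train_predictions = [1 if decision_value(w, b, x) > 0 else -1 for x in samples]
print(train_predictions)
```

In such a setup, the decision value for each block could serve as a raw saliency score, which is why the sketch exposes `decision_value` separately from the thresholded prediction.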

