Deep Spatial and Temporal Information based QoE Evaluation Model for HTTP Adaptive Streaming

Proceedings of the 2021 5th International Conference on Computer Science and Artificial Intelligence. Pub Date: 2021-12-04. DOI: 10.1145/3507548.3507608

L. Du, L. Zhuo, Jiafeng Li, Hui Zhang

Abstract

Video content characteristics are one of the important factors influencing a user's Quality of Experience (QoE). In this paper, deep spatial and temporal information is extracted to characterize video content and is then used to build a QoE evaluation model for HTTP adaptive streaming. First, a Gabor convolutional layer and Channel Attention (CA) are incorporated into ResNet18 to construct the Gabor-CA-ResNet18 network, which captures the Deep Spatial Information (DSI) of the video. To avoid the "curse of dimensionality", LargeVis is applied to reduce the dimensionality of the DSI features, improving their representative and discriminative ability and yielding a compact feature vector. Second, a 3D Convolutional Neural Network (3D CNN) and a Gated Recurrent Unit (GRU) are combined into a network named 3D CNN-GRU, which captures the Deep Temporal Information (DTI) of the video. Finally, the DSI and DTI features are combined with statistical features of other influencing factors, including video quality level, re-buffering duration, and re-buffering frequency, to form the feature parameter vector. Gradient Boosting is used to learn the mapping from the feature parameter vector to the Mean Opinion Score (MOS), which can then be used to predict the user's QoE. Experimental results on the SQoE-III and SQoE-IV datasets show that the proposed model achieves state-of-the-art performance compared with existing QoE evaluation models.
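The final stage described in the abstract, concatenating content features with streaming statistics and regressing onto MOS with gradient boosting, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes squared-error loss and depth-1 regression trees (stumps), and it uses synthetic data throughout. The function names, feature dimensions, hyperparameters, and the rule that generates the toy MOS values are all hypothetical. The DSI and DTI inputs are random stand-ins for what the Gabor-CA-ResNet18 and 3D CNN-GRU networks would produce.

```python
import random


def build_feature_vector(dsi, dti, qos_stats):
    """Concatenate content features with streaming statistics into one vector."""
    return list(dsi) + list(dti) + list(qos_stats)


def make_toy_data(n=60, seed=0):
    """Generate SYNTHETIC samples; every value and the MOS rule are made up."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        dsi = [rng.random(), rng.random()]   # stand-in for reduced DSI features
        dti = [rng.random()]                 # stand-in for DTI features
        quality = rng.randint(1, 5)          # hypothetical quality level
        stall_s = rng.uniform(0.0, 10.0)     # hypothetical re-buffering duration (s)
        stalls = rng.randint(0, 5)           # hypothetical re-buffering count
        mos = 1.0 + 0.8 * quality - 0.2 * stall_s - 0.1 * stalls + rng.gauss(0, 0.1)
        X.append(build_feature_vector(dsi, dti, [quality, stall_s, stalls]))
        y.append(min(5.0, max(1.0, mos)))    # clamp to the 1..5 MOS scale
    return X, y


def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) minimising squared error."""
    mean = sum(residuals) / len(residuals)
    best = (float("inf"), 0, float("inf"), mean, mean)  # fallback: constant prediction
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if sse < best[0]:
                best = (sse, j, threshold, lv, rv)
    return best[1:]


def predict_stump(stump, row):
    feature, threshold, left_value, right_value = stump
    return left_value if row[feature] <= threshold else right_value


def fit_gradient_boosting(X, y, n_rounds=50, learning_rate=0.1):
    """Boost stumps on residuals (the negative gradient of squared-error loss)."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * predict_stump(stump, row) for p, row in zip(pred, X)]
    return base, learning_rate, stumps


def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


def mean_squared_error(model, X, y):
    return sum((predict(model, row) - t) ** 2 for row, t in zip(X, y)) / len(y)


X, y = make_toy_data()
model = fit_gradient_boosting(X, y)
print("training MSE on synthetic data:", mean_squared_error(model, X, y))
```

In practice one would more likely use an existing gradient boosting library and evaluate on held-out data. The hand-rolled version is shown only to make the residual-fitting loop explicit.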

