Authors: L. Du, L. Zhuo, Jiafeng Li, Hui Zhang
Published in: Proceedings of the 2021 5th International Conference on Computer Science and Artificial Intelligence
Publication date: 2021-12-04
DOI: https://doi.org/10.1145/3507548.3507608
Deep Spatial and Temporal Information based QoE Evaluation Model for HTTP Adaptive Streaming
Video content characteristics are among the important factors that influence a user's Quality of Experience (QoE). In this paper, deep spatial and temporal information is extracted to characterize video content and is then used to build a QoE evaluation model for HTTP adaptive streaming. First, a Gabor convolutional layer and Channel Attention (CA) are incorporated into ResNet18 to construct the Gabor-CA-ResNet18 network, which captures the Deep Spatial Information (DSI) of the video. To avoid the "curse of dimensionality", LargeVis is applied to reduce the dimensionality of the DSI features, which improves their representative and discriminative ability and yields a compact feature vector. Second, a 3D Convolutional Neural Network (3D CNN) and a Gated Recurrent Unit (GRU) are combined into a network named 3D CNN-GRU, which captures the Deep Temporal Information (DTI) of the video. Finally, the DSI and DTI features are combined with statistical features of other influencing factors, including video quality level, re-buffering duration, and re-buffering frequency, among others, to form the feature parameter vector. Gradient Boosting is used to learn the mapping from the feature parameter vector to the Mean Opinion Score (MOS), which serves as the prediction of the user's QoE. Experimental results on the SQoE-III and SQoE-IV datasets show that the proposed model achieves state-of-the-art performance compared with existing QoE evaluation models.
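The final stage described above (fusing features and regressing MOS with gradient boosting) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the boosting library, tree depth, loss, or hyperparameters, so the code below assumes squared loss with depth-1 trees (stumps) in pure Python. The function names, feature dimensions, and all numeric values are hypothetical toy data, not values from SQoE-III or SQoE-IV.

```python
# Minimal sketch, under stated assumptions: concatenate per-video features
# and fit a tiny gradient-boosting regressor (squared loss, decision stumps).
# Every name and number here is illustrative, not taken from the paper.

def fuse_features(dsi, dti, stats):
    """Concatenate DSI, DTI and statistical features into one vector."""
    return list(dsi) + list(dti) + list(stats)


def fit_stump(X, residuals):
    """Return the single-feature threshold split with the lowest squared error."""
    mean_all = sum(residuals) / len(residuals)
    best = (float("inf"), 0, float("inf"), mean_all, mean_all)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            err = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if err < best[0]:
                best = (err, j, thr, left_mean, right_mean)
    return best[1:]  # (feature index, threshold, left value, right value)


def stump_predict(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value


def fit_gradient_boosting(X, y, n_rounds=50, learning_rate=0.1):
    """Each round fits a stump to the current residuals (the negative
    gradient of squared loss) and adds a shrunken copy to the ensemble."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict_mos(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


# Toy records: (DSI, DTI, [quality level, re-buffering duration in s,
# re-buffering count], MOS). All values are made up for illustration.
videos = [
    ([0.2, 0.7], [0.1], [4, 0.0, 0], 4.5),
    ([0.3, 0.6], [0.2], [3, 1.0, 1], 3.6),
    ([0.8, 0.1], [0.9], [2, 3.0, 2], 2.4),
    ([0.7, 0.2], [0.8], [1, 6.0, 3], 1.5),
    ([0.4, 0.5], [0.4], [4, 0.5, 1], 4.1),
    ([0.6, 0.3], [0.6], [2, 4.0, 2], 2.0),
]
X = [fuse_features(dsi, dti, stats) for dsi, dti, stats, _ in videos]
y = [mos for _, _, _, mos in videos]

model = fit_gradient_boosting(X, y)
mean_mos = sum(y) / len(y)
baseline_mse = sum((t - mean_mos) ** 2 for t in y) / len(y)
train_mse = sum((predict_mos(model, r) - t) ** 2 for r, t in zip(X, y)) / len(y)
print("baseline MSE:", baseline_mse, "boosted training MSE:", train_mse)
```

In practice the DSI and DTI entries would come from the Gabor-CA-ResNet18 (after LargeVis reduction) and 3D CNN-GRU networks, and a library gradient-boosting regressor with held-out evaluation would replace this hand-rolled version; the sketch only shows how the fused vector feeds a boosted regressor.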