用于 MEC 自适应 360 度视频流的两阶段深度强化学习框架

IF 7.7 2区计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Mobile Computing Pub Date : 2024-08-13 DOI:10.1109/TMC.2024.3443200

Suzhi Bi;Haoguo Chen;Xian Li;Shuoyao Wang;Yuan Wu;Liping Qian

{"title":"用于 MEC 自适应 360 度视频流的两阶段深度强化学习框架","authors":"Suzhi Bi;Haoguo Chen;Xian Li;Shuoyao Wang;Yuan Wu;Liping Qian","doi":"10.1109/TMC.2024.3443200","DOIUrl":null,"url":null,"abstract":"The emerging multi-access edge computing (MEC) technology effectively enhances the wireless streaming performance of 360-degree videos. By connecting a user's head-mounted device (HMD) to a smart MEC platform, the edge server (ES) can efficiently perform adaptive tile-based video streaming to improve the user's viewing experience. Under constrained wireless channel capacity, the ES can predict the user's field of view (FoV) and transmit to the HMD high-resolution video tiles only within the predicted FoV. In practice, the video streaming performance is challenged by the random FoV prediction error and wireless channel fading effects. For this, we propose in this paper a novel two-stage adaptive 360-degree video streaming scheme that maximizes the user's quality of experience (QoE) to attain stable and high-resolution video playback. Specifically, we divide the video file into groups of pictures (GOPs) of fixed playback interval, where each GOP consists of a number of video frames. At the beginning of each GOP (i.e., the inter-GOP stage), the ES predicts the FoV of the next GOP and allocates an encoding bitrate for transmitting (precaching) the video tiles within the predicted FoV. Then, during the real-time video playback of the current GOP (i.e., the intra-GOP stage), the ES observes the user's true FoV of each frame and transmits the missing tiles to compensate for the FoV prediction errors. To maximize the user's QoE under random variations of FoV and wireless channel, we propose a double-agent deep reinforcement learning framework, where the two agents operate in different time scales to decide the bitrates of inter- and intra-GOP stages, respectively. Experiments based on real-world measurements show that the proposed scheme can effectively mitigate FoV prediction errors and maintain stable QoE performance under different scenarios, achieving over 22.1% higher QoE than some representative benchmark methods.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"23 12","pages":"14313-14329"},"PeriodicalIF":7.7000,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Two-Stage Deep Reinforcement Learning Framework for MEC-Enabled Adaptive 360-Degree Video Streaming\",\"authors\":\"Suzhi Bi;Haoguo Chen;Xian Li;Shuoyao Wang;Yuan Wu;Liping Qian\",\"doi\":\"10.1109/TMC.2024.3443200\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The emerging multi-access edge computing (MEC) technology effectively enhances the wireless streaming performance of 360-degree videos. By connecting a user's head-mounted device (HMD) to a smart MEC platform, the edge server (ES) can efficiently perform adaptive tile-based video streaming to improve the user's viewing experience. Under constrained wireless channel capacity, the ES can predict the user's field of view (FoV) and transmit to the HMD high-resolution video tiles only within the predicted FoV. In practice, the video streaming performance is challenged by the random FoV prediction error and wireless channel fading effects. For this, we propose in this paper a novel two-stage adaptive 360-degree video streaming scheme that maximizes the user's quality of experience (QoE) to attain stable and high-resolution video playback. Specifically, we divide the video file into groups of pictures (GOPs) of fixed playback interval, where each GOP consists of a number of video frames. At the beginning of each GOP (i.e., the inter-GOP stage), the ES predicts the FoV of the next GOP and allocates an encoding bitrate for transmitting (precaching) the video tiles within the predicted FoV. Then, during the real-time video playback of the current GOP (i.e., the intra-GOP stage), the ES observes the user's true FoV of each frame and transmits the missing tiles to compensate for the FoV prediction errors. To maximize the user's QoE under random variations of FoV and wireless channel, we propose a double-agent deep reinforcement learning framework, where the two agents operate in different time scales to decide the bitrates of inter- and intra-GOP stages, respectively. Experiments based on real-world measurements show that the proposed scheme can effectively mitigate FoV prediction errors and maintain stable QoE performance under different scenarios, achieving over 22.1% higher QoE than some representative benchmark methods.\",\"PeriodicalId\":50389,\"journal\":{\"name\":\"IEEE Transactions on Mobile Computing\",\"volume\":\"23 12\",\"pages\":\"14313-14329\"},\"PeriodicalIF\":7.7000,\"publicationDate\":\"2024-08-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Mobile Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10634800/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Mobile Computing","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10634800/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

摘要

新兴的多接入边缘计算（MEC）技术可有效提高 360 度视频的无线流媒体性能。通过将用户的头戴式设备（HMD）连接到智能 MEC 平台，边缘服务器（ES）可以有效地执行基于磁贴的自适应视频流，从而改善用户的观看体验。在无线信道容量受限的情况下，ES 可以预测用户的视场（FoV），并仅在预测的 FoV 范围内向 HMD 传输高分辨率视频磁贴。在实际应用中，随机视场角预测误差和无线信道衰落效应对视频流性能提出了挑战。为此，我们在本文中提出了一种新颖的两阶段自适应 360 度视频流方案，它能最大限度地提高用户的体验质量（QoE），实现稳定的高分辨率视频播放。具体来说，我们将视频文件分为固定播放间隔的图片组（GOP），每个 GOP 由若干视频帧组成。在每个 GOP 的开始阶段（即 GOP 间阶段），ES 会预测下一个 GOP 的 FoV，并分配一个编码比特率用于传输（预缓存）预测 FoV 内的视频片段。然后，在当前 GOP 的实时视频播放过程中（即 GOP 内阶段），ES 观察用户每帧的真实视场角，并传输缺失的片段以补偿视场角预测误差。为了在 FoV 和无线信道随机变化的情况下最大限度地提高用户的 QoE，我们提出了一种双代理深度强化学习框架，其中两个代理在不同的时间尺度内运行，分别决定 GOP 间阶段和 GOP 内阶段的比特率。基于实际测量的实验表明，所提出的方案可以有效地减少 FoV 预测误差，并在不同场景下保持稳定的 QoE 性能，与一些具有代表性的基准方法相比，QoE 高出 22.1% 以上。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

A Two-Stage Deep Reinforcement Learning Framework for MEC-Enabled Adaptive 360-Degree Video Streaming

The emerging multi-access edge computing (MEC) technology effectively enhances the wireless streaming performance of 360-degree videos. By connecting a user's head-mounted device (HMD) to a smart MEC platform, the edge server (ES) can efficiently perform adaptive tile-based video streaming to improve the user's viewing experience. Under constrained wireless channel capacity, the ES can predict the user's field of view (FoV) and transmit to the HMD high-resolution video tiles only within the predicted FoV. In practice, the video streaming performance is challenged by the random FoV prediction error and wireless channel fading effects. For this, we propose in this paper a novel two-stage adaptive 360-degree video streaming scheme that maximizes the user's quality of experience (QoE) to attain stable and high-resolution video playback. Specifically, we divide the video file into groups of pictures (GOPs) of fixed playback interval, where each GOP consists of a number of video frames. At the beginning of each GOP (i.e., the inter-GOP stage), the ES predicts the FoV of the next GOP and allocates an encoding bitrate for transmitting (precaching) the video tiles within the predicted FoV. Then, during the real-time video playback of the current GOP (i.e., the intra-GOP stage), the ES observes the user's true FoV of each frame and transmits the missing tiles to compensate for the FoV prediction errors. To maximize the user's QoE under random variations of FoV and wireless channel, we propose a double-agent deep reinforcement learning framework, where the two agents operate in different time scales to decide the bitrates of inter- and intra-GOP stages, respectively. Experiments based on real-world measurements show that the proposed scheme can effectively mitigate FoV prediction errors and maintain stable QoE performance under different scenarios, achieving over 22.1% higher QoE than some representative benchmark methods.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Transactions on Mobile Computing 工程技术-电信学

CiteScore

12.90

自引率

2.50%

发文量

403

审稿时长

6.6 months

期刊介绍： IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing. This includes (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering aspects like mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.