{"title":"DRL Empowered On-policy and Off-policy ABR for 5G Mobile Ultra-HD Video Delivery","authors":"","doi":"10.1007/s11036-024-02311-1","DOIUrl":null,"url":null,"abstract":"<h3>Abstract</h3> <p>Fifth generation (5G) and beyond 5G networks support high-throughput ultra-high definition (UHD) video applications. This paper examines the use of dynamic adaptive streaming over HTTP (DASH) to deliver UHD videos from servers to 5G-capable devices. Due to the dynamic network conditions of wireless networks, it is particularly challenging to provide a high quality of experience (QoE) for UHD video delivery. Consequently, adaptive bit rate (ABR) algorithms are developed to adapt the video bit rate to the network conditions. To improve QoE, several ABR algorithms are developed, the majority of which are based on predetermined rules. Therefore, they do not apply to a broad variety of network conditions. Recent research has shown that ABR algorithms powered by deep reinforcement learning (DRL) based vanilla asynchronous advantage actor-critic (A3C) methods are more effective at generalizing to different network conditions. However, they have some limitations, such as a lag between behavior and target policies, sample inefficiency, and sensitivity to the environment’s randomness. In this paper, we propose the design and implementation of two DRL-empowered ABR algorithms: (i) on-policy proximal policy optimization adaptive bit rate (PPO-ABR), and (ii) off-policy soft-actor critic adaptive bit rate (SAC-ABR). We evaluate the proposed algorithms using 5G traces from the Lumos 5G dataset and show that by utilizing specific properties of on-policy and off-policy methods, our proposed methods perform much better than vanilla A3C for different variations of QoE metrics.</p>","PeriodicalId":501103,"journal":{"name":"Mobile Networks and Applications","volume":"1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Mobile Networks and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s11036-024-02311-1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Fifth-generation (5G) and beyond-5G networks support high-throughput ultra-high definition (UHD) video applications. This paper examines the use of dynamic adaptive streaming over HTTP (DASH) to deliver UHD videos from servers to 5G-capable devices. Because wireless network conditions fluctuate, providing a high quality of experience (QoE) for UHD video delivery is particularly challenging. Adaptive bit rate (ABR) algorithms therefore adjust the video bit rate to the prevailing network conditions. Many ABR algorithms have been developed to improve QoE, but most rely on predetermined rules and consequently do not generalize to a broad variety of network conditions. Recent research has shown that ABR algorithms powered by deep reinforcement learning (DRL), notably the vanilla asynchronous advantage actor-critic (A3C) method, generalize more effectively across network conditions. However, such methods have limitations, including a lag between the behavior and target policies, sample inefficiency, and sensitivity to the environment's randomness. In this paper, we propose the design and implementation of two DRL-empowered ABR algorithms: (i) on-policy proximal policy optimization adaptive bit rate (PPO-ABR), and (ii) off-policy soft actor-critic adaptive bit rate (SAC-ABR). We evaluate the proposed algorithms using 5G traces from the Lumos5G dataset and show that, by exploiting specific properties of on-policy and off-policy methods, they perform substantially better than vanilla A3C across different variations of the QoE metric.
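To make the on-policy idea concrete, the sketch below shows a PPO clipped-surrogate update for a discrete bitrate-selection policy of the kind the abstract describes. This is a minimal illustration, not the paper's implementation: the state layout, the bitrate ladder, the reward weights, and all identifiers (e.g., BITRATE_LEVELS_KBPS, qoe_reward) are hypothetical, and the QoE reward follows the linear quality/rebuffering/smoothness form commonly used in the ABR literature rather than the exact metric evaluated in the paper.

```python
# Minimal PPO-clip sketch for DRL-based ABR (illustrative only; assumes a
# Pensieve-style state of recent throughput samples plus buffer occupancy,
# and one discrete action per bitrate level).
import torch
import torch.nn as nn

BITRATE_LEVELS_KBPS = [1000, 2500, 5000, 8000, 16000, 40000]  # hypothetical UHD ladder
STATE_DIM = 10   # e.g., throughput history + buffer level (assumed layout)
CLIP_EPS = 0.2   # PPO clipping parameter

class PolicyNet(nn.Module):
    """Maps the streaming state to a distribution over bitrate levels."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, len(BITRATE_LEVELS_KBPS)))

    def forward(self, state):
        return torch.distributions.Categorical(logits=self.net(state))

def qoe_reward(quality, rebuffer_s, last_quality, mu=4.3, sigma=1.0):
    # Linear QoE: per-chunk quality minus rebuffering and smoothness
    # penalties (mu, sigma are illustrative weights).
    return quality - mu * rebuffer_s - sigma * abs(quality - last_quality)

def ppo_update(policy, optimizer, states, actions, old_log_probs, advantages):
    # Clipped surrogate objective: the probability ratio between the current
    # and the data-collecting policy is clipped so one minibatch cannot move
    # the policy too far from the behavior policy -- this is what mitigates
    # the behavior/target-policy lag mentioned above.
    dist = policy(states)
    ratio = torch.exp(dist.log_prob(actions) - old_log_probs)
    clipped = torch.clamp(ratio, 1 - CLIP_EPS, 1 + CLIP_EPS)
    loss = -torch.min(ratio * advantages, clipped * advantages).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    policy = PolicyNet()
    optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)
    # Dummy rollout data, just to exercise the update end to end.
    states = torch.randn(32, STATE_DIM)
    actions = torch.randint(0, len(BITRATE_LEVELS_KBPS), (32,))
    with torch.no_grad():
        old_log_probs = policy(states).log_prob(actions)
    advantages = torch.randn(32)
    print("PPO loss:", ppo_update(policy, optimizer, states, actions,
                                  old_log_probs, advantages))
```

An off-policy SAC variant would replace the clipped surrogate with entropy-regularized Q-learning over a replay buffer, which is where the sample-efficiency advantage the abstract attributes to SAC-ABR comes from.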