QFlow: A Reinforcement Learning Approach to High QoE Video Streaming over Wireless Networks

Rajarshi Bhattacharyya, Archana Bura, Desik Rengarajan, Mason Rumuly, S. Shakkottai, D. Kalathil, Ricky K. P. Mok, A. Dhamdhere
{"title":"QFlow: A Reinforcement Learning Approach to High QoE Video Streaming over Wireless Networks","authors":"Rajarshi Bhattacharyya, Archana Bura, Desik Rengarajan, Mason Rumuly, S. Shakkottai, D. Kalathil, Ricky K. P. Mok, A. Dhamdhere","doi":"10.1145/3323679.3326523","DOIUrl":null,"url":null,"abstract":"Wireless Internet access has brought legions of heterogeneous applications all sharing the same resources. However, current wireless edge networks that cater to worst or average case performance lack the agility to best serve these diverse sessions. Simultaneously, software reconfigurable infrastructure has become increasingly mainstream to the point that dynamic per packet and per flow decisions are possible at multiple layers of the communications stack. Exploiting such reconfigurability requires the design of a system that can enable a configuration, measure the impact on the application performance (Quality of Experience), and adaptively select a new configuration. Effectively, this feedback loop is a Markov Decision Process whose parameters are unknown. The goal of this work is to design, develop and demonstrate QFlow that instantiates this feedback loop as an application of reinforcement learning (RL). Our context is that of reconfigurable (priority) queueing, and we use the popular application of video streaming as our use case. We develop both model-free and model-based RL approaches that are tailored to the problem of determining which clients should be assigned to which queue at each decision period. Through experimental validation, we show how the RL-based control policies on QFlow are able to schedule the right clients for prioritization in a high-load scenario to outperform the status quo, as well as the best known solutions with over 25% improvement in QoE, and a perfect QoE score of 5 over 85% of the time.","PeriodicalId":205641,"journal":{"name":"Proceedings of the Twentieth ACM International Symposium on Mobile Ad Hoc Networking and Computing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"41","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Twentieth ACM International Symposium on Mobile Ad Hoc Networking and Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3323679.3326523","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 41

Abstract

Wireless Internet access has brought legions of heterogeneous applications that all share the same resources. However, current wireless edge networks, which cater to worst-case or average-case performance, lack the agility to best serve these diverse sessions. Simultaneously, software-reconfigurable infrastructure has become increasingly mainstream, to the point that dynamic per-packet and per-flow decisions are possible at multiple layers of the communications stack. Exploiting such reconfigurability requires the design of a system that can enable a configuration, measure its impact on application performance (Quality of Experience, QoE), and adaptively select a new configuration. Effectively, this feedback loop is a Markov Decision Process (MDP) whose parameters are unknown. The goal of this work is to design, develop, and demonstrate QFlow, which instantiates this feedback loop as an application of reinforcement learning (RL). Our context is that of reconfigurable (priority) queueing, and we use the popular application of video streaming as our use case. We develop both model-free and model-based RL approaches tailored to the problem of determining which clients should be assigned to which queue at each decision period. Through experimental validation, we show how the RL-based control policies on QFlow schedule the right clients for prioritization in a high-load scenario, outperforming both the status quo and the best-known solutions with over a 25% improvement in QoE and a perfect QoE score of 5 more than 85% of the time.
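To make the abstract's control loop concrete, the sketch below shows the general shape of a model-free RL policy that, at each decision period, picks which client to place in the priority queue based on observed per-client QoE. This is a minimal, hypothetical tabular Q-learning illustration: the stub environment, the 1-5 QoE discretization, the mean-QoE reward, and all constants are assumptions for illustration only, not the paper's actual state space, reward, or implementation.

```python
# Hypothetical sketch of a model-free RL loop in the spirit of QFlow:
# choose which client to prioritize each decision period from QoE feedback.
import random
from collections import defaultdict

NUM_CLIENTS = 4   # assumed number of competing video clients
EPSILON = 0.1     # exploration rate
ALPHA = 0.1       # learning rate
GAMMA = 0.9       # discount factor

def observe_state():
    """Stub: each client's QoE bucketed into {1..5}.
    A real system would measure this at the clients."""
    return tuple(random.randint(1, 5) for _ in range(NUM_CLIENTS))

def step(state, prioritized_client):
    """Stub environment: the prioritized client's QoE tends to rise,
    others' tend to drift down. Reward is the mean QoE across clients."""
    nxt = list(state)
    for c in range(NUM_CLIENTS):
        drift = 1 if c == prioritized_client else random.choice([-1, 0])
        nxt[c] = max(1, min(5, nxt[c] + drift))
    return tuple(nxt), sum(nxt) / NUM_CLIENTS

Q = defaultdict(float)  # Q[(state, action)] -> estimated value
state = observe_state()
for t in range(10_000):
    # epsilon-greedy choice of which client gets the priority queue
    if random.random() < EPSILON:
        action = random.randrange(NUM_CLIENTS)
    else:
        action = max(range(NUM_CLIENTS), key=lambda a: Q[(state, a)])
    next_state, reward = step(state, action)
    # standard tabular Q-learning update toward the bootstrapped target
    best_next = max(Q[(next_state, a)] for a in range(NUM_CLIENTS))
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
    state = next_state
```

The paper additionally develops a model-based variant; the tabular, single-priority-queue setup above is only meant to show where the MDP state, action, and QoE-driven reward sit in such a feedback loop.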