QFlow: A Reinforcement Learning Approach to High QoE Video Streaming over Wireless Networks

Rajarshi Bhattacharyya, Archana Bura, Desik Rengarajan, Mason Rumuly, S. Shakkottai, D. Kalathil, Ricky K. P. Mok, A. Dhamdhere
{"title":"QFlow: A Reinforcement Learning Approach to High QoE Video Streaming over Wireless Networks","authors":"Rajarshi Bhattacharyya, Archana Bura, Desik Rengarajan, Mason Rumuly, S. Shakkottai, D. Kalathil, Ricky K. P. Mok, A. Dhamdhere","doi":"10.1145/3323679.3326523","DOIUrl":null,"url":null,"abstract":"Wireless Internet access has brought legions of heterogeneous applications all sharing the same resources. However, current wireless edge networks that cater to worst or average case performance lack the agility to best serve these diverse sessions. Simultaneously, software reconfigurable infrastructure has become increasingly mainstream to the point that dynamic per packet and per flow decisions are possible at multiple layers of the communications stack. Exploiting such reconfigurability requires the design of a system that can enable a configuration, measure the impact on the application performance (Quality of Experience), and adaptively select a new configuration. Effectively, this feedback loop is a Markov Decision Process whose parameters are unknown. The goal of this work is to design, develop and demonstrate QFlow that instantiates this feedback loop as an application of reinforcement learning (RL). Our context is that of reconfigurable (priority) queueing, and we use the popular application of video streaming as our use case. We develop both model-free and model-based RL approaches that are tailored to the problem of determining which clients should be assigned to which queue at each decision period. Through experimental validation, we show how the RL-based control policies on QFlow are able to schedule the right clients for prioritization in a high-load scenario to outperform the status quo, as well as the best known solutions with over 25% improvement in QoE, and a perfect QoE score of 5 over 85% of the time.","PeriodicalId":205641,"journal":{"name":"Proceedings of the Twentieth ACM International Symposium on Mobile Ad Hoc Networking and Computing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"41","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Twentieth ACM International Symposium on Mobile Ad Hoc Networking and Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3323679.3326523","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 41

Abstract

Wireless Internet access has brought legions of heterogeneous applications that all share the same resources. However, current wireless edge networks, which cater to worst-case or average-case performance, lack the agility to best serve these diverse sessions. Simultaneously, software-reconfigurable infrastructure has become increasingly mainstream, to the point that dynamic per-packet and per-flow decisions are possible at multiple layers of the communications stack. Exploiting such reconfigurability requires the design of a system that can enable a configuration, measure its impact on application performance (Quality of Experience, QoE), and adaptively select a new configuration. Effectively, this feedback loop is a Markov Decision Process (MDP) whose parameters are unknown. The goal of this work is to design, develop, and demonstrate QFlow, which instantiates this feedback loop as an application of reinforcement learning (RL). Our context is that of reconfigurable (priority) queueing, and we use the popular application of video streaming as our use case. We develop both model-free and model-based RL approaches tailored to the problem of determining which clients should be assigned to which queue at each decision period. Through experimental validation, we show how the RL-based control policies on QFlow schedule the right clients for prioritization in a high-load scenario, outperforming both the status quo and the best-known solutions with over a 25% improvement in QoE and a perfect QoE score of 5 more than 85% of the time.
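To make the abstract's control loop concrete, the sketch below shows the general shape of a model-free RL policy that, at each decision period, picks which client to place in the priority queue based on observed per-client QoE. This is a minimal, hypothetical tabular Q-learning illustration: the stub environment, the 1-5 QoE discretization, the mean-QoE reward, and all constants are assumptions for illustration only, not the paper's actual state space, reward, or implementation.

```python
# Hypothetical sketch of a model-free RL loop in the spirit of QFlow:
# choose which client to prioritize each decision period from QoE feedback.
import random
from collections import defaultdict

NUM_CLIENTS = 4   # assumed number of competing video clients
EPSILON = 0.1     # exploration rate
ALPHA = 0.1       # learning rate
GAMMA = 0.9       # discount factor

def observe_state():
    """Stub: each client's QoE bucketed into {1..5}.
    A real system would measure this at the clients."""
    return tuple(random.randint(1, 5) for _ in range(NUM_CLIENTS))

def step(state, prioritized_client):
    """Stub environment: the prioritized client's QoE tends to rise,
    others' tend to drift down. Reward is the mean QoE across clients."""
    nxt = list(state)
    for c in range(NUM_CLIENTS):
        drift = 1 if c == prioritized_client else random.choice([-1, 0])
        nxt[c] = max(1, min(5, nxt[c] + drift))
    return tuple(nxt), sum(nxt) / NUM_CLIENTS

Q = defaultdict(float)  # Q[(state, action)] -> estimated value
state = observe_state()
for t in range(10_000):
    # epsilon-greedy choice of which client gets the priority queue
    if random.random() < EPSILON:
        action = random.randrange(NUM_CLIENTS)
    else:
        action = max(range(NUM_CLIENTS), key=lambda a: Q[(state, a)])
    next_state, reward = step(state, action)
    # standard tabular Q-learning update toward the bootstrapped target
    best_next = max(Q[(next_state, a)] for a in range(NUM_CLIENTS))
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
    state = next_state
```

The paper additionally develops a model-based variant; the tabular, single-priority-queue setup above is only meant to show where the MDP state, action, and QoE-driven reward sit in such a feedback loop.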