Flow Scheduling in a Heterogeneous NFV Environment using Reinforcement Learning

Chun Jen Lin, Yan Luo, Liang-Min Wang, Li-De Chen
{"title":"基于强化学习的异构NFV环境流调度","authors":"Chun Jen Lin, Yan Luo, Liang-Min Wang, Li-De Chen","doi":"10.1109/nas51552.2021.9605395","DOIUrl":null,"url":null,"abstract":"Network function virtualization (NFV) allows net-work functions executed on general-purpose servers or virtual machines (VMs) instead of proprietary hardware, greatly improving the flexibility and scalability of network services. Recent trends in using programmable accelerators to speed up NFV performance introduce challenges in flow scheduling in a dynamic NFV environment. Reinforcement learning (RL) trains machine learning models for decision making to maximize returns in uncertain environments such as NFV. In this paper, we study the allocation of heterogeneous processors (CPUs and FPGAs) to minimize the delays of flows in the system. We conduct extensive simulations to evaluate the performance of reinforcement learning based scheduling algorithms such as Advantage Actor Critic (A2C), Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO), and compare with greedy policies. The results show that RL based schedulers can effectively learn from past experiences and converge to the optimal greedy policy. We also analyze in-depth how the policies lead to different processor utilization and flow processing time, and provide insights into these policies.","PeriodicalId":135930,"journal":{"name":"2021 IEEE International Conference on Networking, Architecture and Storage (NAS)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Flow Scheduling in a Heterogeneous NFV Environment using Reinforcement Learning\",\"authors\":\"Chun Jen Lin, Yan Luo, Liang-Min Wang, Li-De Chen\",\"doi\":\"10.1109/nas51552.2021.9605395\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Network function virtualization (NFV) allows net-work functions executed on general-purpose servers or virtual machines (VMs) instead of proprietary hardware, greatly improving the flexibility and scalability of network services. Recent trends in using programmable accelerators to speed up NFV performance introduce challenges in flow scheduling in a dynamic NFV environment. Reinforcement learning (RL) trains machine learning models for decision making to maximize returns in uncertain environments such as NFV. In this paper, we study the allocation of heterogeneous processors (CPUs and FPGAs) to minimize the delays of flows in the system. We conduct extensive simulations to evaluate the performance of reinforcement learning based scheduling algorithms such as Advantage Actor Critic (A2C), Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO), and compare with greedy policies. The results show that RL based schedulers can effectively learn from past experiences and converge to the optimal greedy policy. 
We also analyze in-depth how the policies lead to different processor utilization and flow processing time, and provide insights into these policies.\",\"PeriodicalId\":135930,\"journal\":{\"name\":\"2021 IEEE International Conference on Networking, Architecture and Storage (NAS)\",\"volume\":\"28 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE International Conference on Networking, Architecture and Storage (NAS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/nas51552.2021.9605395\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Networking, Architecture and Storage (NAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/nas51552.2021.9605395","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Network function virtualization (NFV) allows network functions to be executed on general-purpose servers or virtual machines (VMs) instead of proprietary hardware, greatly improving the flexibility and scalability of network services. Recent trends in using programmable accelerators to speed up NFV performance introduce challenges in flow scheduling in a dynamic NFV environment. Reinforcement learning (RL) trains machine learning models for decision making to maximize returns in uncertain environments such as NFV. In this paper, we study the allocation of heterogeneous processors (CPUs and FPGAs) to minimize the delays of flows in the system. We conduct extensive simulations to evaluate the performance of reinforcement learning based scheduling algorithms such as Advantage Actor-Critic (A2C), Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO), and compare them with greedy policies. The results show that RL-based schedulers can effectively learn from past experiences and converge to the optimal greedy policy. We also analyze in depth how the policies lead to different processor utilization and flow processing times, and provide insights into these policies.
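
The paper itself provides no code, but the problem framed in the abstract, assigning each arriving flow to a CPU or FPGA so as to minimize its delay, maps naturally onto a standard RL environment. Below is a minimal, purely illustrative sketch under assumed parameters: the FlowSchedulingEnv class, the processor service rates, and the flow-size distribution are all invented for illustration, not taken from the paper, and the training call uses stable-baselines3's off-the-shelf PPO rather than the authors' implementation.

```python
# Illustrative sketch only -- not the authors' simulator. It casts the
# abstract's problem (route each arriving flow to a CPU or an FPGA to
# minimize queuing delay) as a gymnasium environment and trains PPO on
# it. All numeric parameters are made-up placeholders.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class FlowSchedulingEnv(gym.Env):
    """Toy heterogeneous scheduler: n_cpus CPUs + n_fpgas FPGAs; the action is a processor index."""

    def __init__(self, n_cpus=2, n_fpgas=1, episode_len=200):
        super().__init__()
        self.n_procs = n_cpus + n_fpgas
        # Hypothetical service rates: the FPGA drains work faster than a CPU.
        self.rates = np.array([1.0] * n_cpus + [4.0] * n_fpgas)
        self.episode_len = episode_len
        # Observation: per-processor backlog (work units) plus the size
        # of the flow currently awaiting assignment.
        self.observation_space = spaces.Box(0.0, np.inf, (self.n_procs + 1,), np.float32)
        self.action_space = spaces.Discrete(self.n_procs)

    def _obs(self):
        return np.append(self.backlog, self.flow_size).astype(np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.backlog = np.zeros(self.n_procs)
        self.flow_size = self.np_random.exponential(2.0)
        self.t = 0
        return self._obs(), {}

    def step(self, action):
        # Delay seen by this flow: time to drain the chosen queue plus
        # the flow's own service time on that processor.
        delay = (self.backlog[action] + self.flow_size) / self.rates[action]
        self.backlog[action] += self.flow_size
        # One time unit elapses; each processor drains at its own rate.
        self.backlog = np.maximum(self.backlog - self.rates, 0.0)
        self.flow_size = self.np_random.exponential(2.0)
        self.t += 1
        # Negative delay as reward, so maximizing return minimizes delay.
        return self._obs(), -delay, False, self.t >= self.episode_len, {}

env = FlowSchedulingEnv()
model = PPO("MlpPolicy", env, verbose=0)  # A2C trains the same way: swap PPO for A2C
model.learn(total_timesteps=20_000)
```

Under the same assumptions, the greedy baseline the RL agents are compared against would simply pick np.argmin((backlog + flow_size) / rates) for every arriving flow; the abstract's reported result is that the learned policies converge toward exactly that optimal greedy behavior.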