{"title":"Flow Scheduling in a Heterogeneous NFV Environment using Reinforcement Learning","authors":"Chun Jen Lin, Yan Luo, Liang-Min Wang, Li-De Chen","doi":"10.1109/nas51552.2021.9605395","DOIUrl":null,"url":null,"abstract":"Network function virtualization (NFV) allows net-work functions executed on general-purpose servers or virtual machines (VMs) instead of proprietary hardware, greatly improving the flexibility and scalability of network services. Recent trends in using programmable accelerators to speed up NFV performance introduce challenges in flow scheduling in a dynamic NFV environment. Reinforcement learning (RL) trains machine learning models for decision making to maximize returns in uncertain environments such as NFV. In this paper, we study the allocation of heterogeneous processors (CPUs and FPGAs) to minimize the delays of flows in the system. We conduct extensive simulations to evaluate the performance of reinforcement learning based scheduling algorithms such as Advantage Actor Critic (A2C), Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO), and compare with greedy policies. The results show that RL based schedulers can effectively learn from past experiences and converge to the optimal greedy policy. We also analyze in-depth how the policies lead to different processor utilization and flow processing time, and provide insights into these policies.","PeriodicalId":135930,"journal":{"name":"2021 IEEE International Conference on Networking, Architecture and Storage (NAS)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Networking, Architecture and Storage (NAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/nas51552.2021.9605395","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Network function virtualization (NFV) allows network functions to be executed on general-purpose servers or virtual machines (VMs) instead of proprietary hardware, greatly improving the flexibility and scalability of network services. The recent trend of using programmable accelerators to speed up NFV performance introduces new flow-scheduling challenges in dynamic NFV environments. Reinforcement learning (RL) trains decision-making models to maximize returns in uncertain environments such as NFV. In this paper, we study the allocation of heterogeneous processors (CPUs and FPGAs) to minimize the delays of flows in the system. We conduct extensive simulations to evaluate the performance of RL-based scheduling algorithms, namely Advantage Actor-Critic (A2C), Trust Region Policy Optimization (TRPO), and Proximal Policy Optimization (PPO), and compare them with greedy policies. The results show that RL-based schedulers can effectively learn from past experiences and converge to the optimal greedy policy. We also analyze in depth how the policies lead to different processor utilization and flow processing times, and provide insights into these policies.
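The abstract does not specify the simulator or model details, but the problem it describes can be sketched concretely. Below is a minimal, hypothetical illustration (not the authors' code): a toy gymnasium environment where an agent assigns each arriving flow to one of a few heterogeneous processors (two CPUs and one faster FPGA, an assumed mix) and is rewarded for low estimated delay, trained with one of the RL algorithms the paper evaluates (PPO, via stable-baselines3). The queueing model, service rates, and flow-size distribution are all illustrative assumptions.

```python
# Hypothetical sketch of the scheduling setup described in the abstract;
# not the paper's simulator. Flow sizes, service rates, and the queue
# model are illustrative assumptions.
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class FlowSchedulingEnv(gym.Env):
    """Toy heterogeneous environment: 2 CPUs + 1 FPGA (assumed mix)."""

    # Service rate per processor (work units per step); the FPGA is faster.
    RATES = np.array([1.0, 1.0, 4.0])

    def __init__(self, episode_len=200):
        super().__init__()
        self.episode_len = episode_len
        # Observation: pending work (backlog) per processor + size of the new flow.
        self.observation_space = spaces.Box(0.0, np.inf, shape=(4,), dtype=np.float32)
        # Action: index of the processor the new flow is assigned to.
        self.action_space = spaces.Discrete(3)

    def _obs(self):
        return np.append(self.backlog, self.flow).astype(np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.backlog = np.zeros(3)
        self.flow = self.np_random.exponential(2.0)  # size of the arriving flow
        self.t = 0
        return self._obs(), {}

    def step(self, action):
        self.backlog[action] += self.flow
        # Estimated delay of the flow just placed: backlog / service rate.
        delay = self.backlog[action] / self.RATES[action]
        # Each processor drains work at its own rate for one time step.
        self.backlog = np.maximum(self.backlog - self.RATES, 0.0)
        self.flow = self.np_random.exponential(2.0)
        self.t += 1
        # Negative delay as reward: the agent learns to minimize flow delay.
        return self._obs(), -delay, self.t >= self.episode_len, False, {}


def greedy_action(obs):
    """Greedy baseline: pick the processor with the shortest expected wait."""
    backlog, flow = obs[:3], obs[3]
    return int(np.argmin((backlog + flow) / FlowSchedulingEnv.RATES))


if __name__ == "__main__":
    from stable_baselines3 import PPO

    env = FlowSchedulingEnv()
    model = PPO("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=20_000)
```

In a setup like this, the paper's reported outcome corresponds to the trained policy's per-episode return approaching that of `greedy_action`; A2C or TRPO can be substituted for PPO with the same environment.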