Multi-Agent Reinforcement Learning With Contribution-Based Assignment Online Routing In SDN
Authors: Xiaofeng Yue, Wu Lijun, Weiwei Duan
Published in: 2022 19th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), 2022-12-16
DOI: 10.1109/ICCWAMTIP56608.2022.10016566
Emerging applications place strict quality of service (QoS) requirements on the Internet, and networks must guarantee different QoS levels for the different data flows of various Internet services. Advances in traffic classification, software-defined networking (SDN), and programmable network devices make it possible to identify user requirements quickly and to control the routing of fine-grained traffic. In this paper, we propose CBR, an online routing algorithm based on multi-agent deep reinforcement learning. CBR uses a graph convolutional network (GCN) to extract topology features, designs a separate reward function for each type of traffic demand so that it learns an appropriate routing policy for each, and organizes agents to generate routes hop by hop. Because the agents share a single reward value, it is hard to tell whether the action taken by a given agent was critical; to address this, CBR designs a new baseline that indicates each agent's contribution level. Finally, to speed up training, we pre-train on shortest-path rules to obtain initial parameters, and to ensure reliability, we introduce a routing alternative mechanism that safeguards online routing. We conducted Mininet-based experiments on the Abilene and GEANT network topologies. The experimental results show that CBR meets the demands of different service types for their requested traffic simultaneously while remaining reliable when links fail.
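The abstract does not give CBR's implementation details, so the following is only a minimal sketch of the general idea of hop-by-hop route generation with a fallback route, under stated assumptions. The function names `hop_by_hop_route` and `shortest_path`, the `choose_next_hop` callback (a stand-in for a trained per-node agent policy), and the use of a fewest-hops breadth-first search as the alternative route are all hypothetical choices made for illustration, not the authors' method.

```python
from collections import deque


def shortest_path(graph, src, dst):
    """Fewest-hops path via breadth-first search, or None if unreachable."""
    parents = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the parent pointers back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nbr in graph.get(node, ()):
            if nbr not in parents:
                parents[nbr] = node
                queue.append(nbr)
    return None


def hop_by_hop_route(graph, src, dst, choose_next_hop):
    """Build a route one hop at a time.

    choose_next_hop(node, dst) is a hypothetical stand-in for the agent
    at `node`. If it proposes a non-neighbor (for example, across a failed
    link) or a node already on the path (a loop), fall back to the
    shortest path. Returns (path, used_fallback).
    """
    path = [src]
    while path[-1] != dst:
        nxt = choose_next_hop(path[-1], dst)
        if nxt not in graph.get(path[-1], ()) or nxt in path:
            return shortest_path(graph, src, dst), True
        path.append(nxt)
    return path, False


# Toy four-node topology (assumed for illustration only).
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
policy = {("A", "D"): "B", ("B", "D"): "D"}
route, used_fallback = hop_by_hop_route(
    graph, "A", "D", lambda node, dst: policy.get((node, dst))
)

# Same policy after link B-D fails: the agent at B proposes an invalid hop.
degraded = {"A": ["B", "C"], "B": ["A"], "C": ["A", "D"], "D": ["C"]}
backup_route, backup_used = hop_by_hop_route(
    degraded, "A", "D", lambda node, dst: policy.get((node, dst))
)
```

In this sketch the loop check bounds the route length by the number of nodes, so route generation always terminates; a learned policy in a real controller would need a comparable guard before rules are installed.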