P. Morales, Patricia Franco, A. Lozada, N. Jara, F. Calderón, Juan Pinto-Ríos, A. Leiva
2021 International Conference on Optical Network Design and Modeling (ONDM), published 2021-06-28
DOI: 10.23919/ONDM51796.2021.9492435 (https://doi.org/10.23919/ONDM51796.2021.9492435)
Citations: 7
Multi-band Environments for Optical Reinforcement Learning Gym for Resource Allocation in Elastic Optical Networks
The use of additional fiber bands for optical communications, known as multi-band or band-division multiplexing (BDM), makes it possible to increase the traffic served in transparent optical networks. In recent years, many proposals have emerged for resource allocation in such multi-band architectures. This work presents a novel approach based on reinforcement learning (RL) techniques to allocate multi-band elastic optical network resources. Two new environments were implemented and added to the Optical-RL-Gym toolkit, covering four scenarios with different band availability. Six agents were tested on four real network topologies, and their episode rewards were compared over a large number of training steps. The results show Trust Region Policy Optimization (TRPO) to be the best-performing agent, with consistent results across all the scenarios and network topologies considered. In addition, we illustrate how the blocking probability varies with the traffic load, and how usage is distributed across bands, to support further discussion.
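To make the two reported metrics concrete (blocking probability versus load, and band usage distribution), here is a minimal toy sketch in plain Python. It is not the paper's environment and does not use the Optical-RL-Gym API: the single-link model, the band sizes in `BANDS`, the first-fit policy, and the names `first_fit` and `simulate` are all assumptions made for illustration.

```python
import random

# Hypothetical band sizes (frequency slots per band); not taken from the paper.
BANDS = {"C": 8, "L": 12}


def first_fit(slots, demand, now):
    """Return the first start index of `demand` contiguous free slots, or None."""
    run = 0
    for i, free_at in enumerate(slots):
        # A slot is free once its release time has passed.
        run = run + 1 if free_at <= now else 0
        if run == demand:
            return i - demand + 1
    return None


def simulate(num_requests, mean_holding, seed=0, bands=BANDS):
    """Toy single-link model: one request per time step, bands tried in order.

    Returns (blocking probability, number of accepted requests per band).
    """
    rng = random.Random(seed)
    # release[band][slot] = time step at which that slot becomes free again
    release = {band: [0] * size for band, size in bands.items()}
    usage = {band: 0 for band in bands}
    blocked = 0
    for now in range(num_requests):
        demand = rng.randint(1, 3)  # contiguous slots requested
        holding = 1 + int(rng.expovariate(1.0 / mean_holding))
        for band, slots in release.items():
            start = first_fit(slots, demand, now)
            if start is not None:
                for s in range(start, start + demand):
                    slots[s] = now + holding
                usage[band] += 1
                break
        else:
            # No band had enough contiguous free slots: the request is blocked.
            blocked += 1
    return blocked / num_requests, usage


if __name__ == "__main__":
    for mean_holding in (1, 5, 20, 50):  # longer holding time = higher load
        prob, usage = simulate(2000, mean_holding)
        print(f"mean holding {mean_holding:>2}: blocking={prob:.3f}, usage={usage}")
```

In this sketch, load is raised by lengthening the mean holding time while the arrival rate stays fixed. An RL agent would replace the fixed first-fit, band-order policy with a learned choice of band and slot position.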