P. Gonzalez;F. Alhamed;H. Shakespear-Miles;S. Barzegar;F. Paolucci;A. Sgambelluri;J. J. Vegas Olmos;M. Ruiz;L. Velasco
Title: Near-real-time 6G service operation enabled by distributed intelligence and in-band telemetry
Journal: Journal of Optical Communications and Networking, vol. 17, no. 3, pp. A247-A258 (Journal Article)
Publication date: 2025-02-26
DOI: 10.1364/JOCN.533789
URL: https://ieeexplore.ieee.org/document/10906307/
Impact factor: 4.0 (JCR Q1, Computer Science, Hardware & Architecture)
Citation count: 0
Abstract
Highly dynamic network services demand stringent quality of service (QoS), especially in terms of end-to-end (e2e) delay, while capital and operational costs must be reduced; centralized software-defined networking (SDN) solutions alone cannot meet both demands. In particular, the expected dynamicity requires autonomous near-real-time operation fed with pervasive telemetry to make per-service decisions that ensure the committed QoS while reducing overprovisioning as much as possible. In this paper, we propose a distributed control architecture based on multi-agent systems (MASs) to assist the SDN controller in controlling network services in near real time. Per-traffic-flow telemetry data are collected from the packet nodes, distributed through the agents in the control plane, and analyzed to assure performance and anticipate any degradation. The measurements feed flow agents, which are based on deep reinforcement learning (DRL) models and make routing decisions aimed at ensuring flow performance. When QoS degradation is detected, we propose algorithms to analyze its cause, which can be a bottleneck in the network. We show how such a bottleneck is detected and how additional capacity is requested from the SDN controller, which in turn creates an optical bypass to provide it. The proposed solution is demonstrated experimentally on a federated testbed connecting UPC and CNIT premises. First, focusing on the control plane, we experimentally assess the feasibility of the proposed architecture and workflows. Then, the performance of the near-real-time operation is evaluated at the data plane to verify that the maximum e2e delay is not exceeded for multiple flows, showing the effectiveness of predictive QoS evaluation together with infrastructure and service reconfiguration.
About the journal:
The scope of the Journal includes advances in the state-of-the-art of optical networking science, technology, and engineering. Both theoretical contributions (including new techniques, concepts, analyses, and economic studies) and practical contributions (including optical networking experiments, prototypes, and new applications) are encouraged. Subareas of interest include the architecture and design of optical networks, optical network survivability and security, software-defined optical networking, elastic optical networks, data and control plane advances, network management related innovation, and optical access networks. Enabling technologies and their applications are suitable topics only if the results are shown to directly impact optical networking beyond simple point-to-point networks.