Optimizing telemetry forwarding for distributed failure recovery in packet-optical networks
Piotr Lechowicz, Carlos Natalino, Filippo Cugini, Francesco Paolucci, Paolo Monti
Journal of Optical Communications and Networking, vol. 17, no. 2, pp. 152-162, published 2025-01-29
DOI: 10.1364/JOCN.534559
https://ieeexplore.ieee.org/document/10857708/
Abstract
Fast network recoverability from hard and soft failures is crucial for network operators to deliver uninterrupted services. Streaming telemetry has been studied as a solution for enabling fast and accurate failure detection in optical networks. However, significant delay is incurred when relying on a centralized entity (e.g., software-defined network controller) to collect, process, and act on telemetry data. Programmable switches (e.g., P4-based) allow telemetry data to be processed at line speed, enabling local on-device (distributed) decisions. These devices can be used to deploy quick and local mitigation to failures while a global solution is being computed on a longer time scale. However, designing network-wide streaming telemetry with distributed decisions remains an open challenge. In this work, we specify the joint optimization of packet-optical networks with on-device failure recovery, considering multiple aspects of the problem. The problem is modeled using linear programming and solved for multiple network realizations. The solutions can be used to program each switch in the network to detect failures and quickly recover the traffic. Results show that the proposed model decreases the required number of register entries to store telemetry data while assuring high recoverability and a minimized number of wavelengths.
About the journal:
The scope of the Journal includes advances in the state-of-the-art of optical networking science, technology, and engineering. Both theoretical contributions (including new techniques, concepts, analyses, and economic studies) and practical contributions (including optical networking experiments, prototypes, and new applications) are encouraged. Subareas of interest include the architecture and design of optical networks, optical network survivability and security, software-defined optical networking, elastic optical networks, data and control plane advances, network management related innovation, and optical access networks. Enabling technologies and their applications are suitable topics only if the results are shown to directly impact optical networking beyond simple point-to-point networks.