Detecting Service Disruptions in Large BGP/MPLS VPN Networks
Alex Huang Feng; Pierre Francois; Maxence Younsi; Stéphane Frénot; Thomas Graf; Wanting Du; Paolo Lucente; Ahmed Elhassany
IEEE Transactions on Network and Service Management, vol. 22, no. 5, pp. 3964-3977, published 2025-07-11. DOI: 10.1109/TNSM.2025.3588314
https://ieeexplore.ieee.org/document/11078445/
Abstract
This paper presents the results of three years of experience in the research, design, and deployment of a complete architecture for automatically identifying service disruptions in large BGP/MPLS VPN networks. We present the main components of a comprehensive architecture that can be operated in production environments, highlighting the requirements that led to their design. We describe the data that are collected from the network using IETF standard protocols, the processing that is performed on them to detect anomalies, and the scaling aspects that must be considered when ingesting the large amounts of data required for this purpose. We report on two and a half years of deployment experience on the Swisscom BGP/MPLS VPN network services, analyzing the behavior of our system in the face of actual network incidents. After each incident, we systematically performed post-mortem analyses. These investigations led us to conclude that the rule-based approaches currently used in deployment, supported by profiling of the VPN customers to fine-tune rule parameters, enable the detection of service disruptions with the required accuracy.
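As an illustration only, the idea of a rule whose parameters are tuned per VPN customer can be sketched as follows. This is a minimal sketch under stated assumptions, not the architecture or the rules described in the paper: the `CustomerProfile` fields, the traffic-drop rule, the thresholds, and the VPN names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CustomerProfile:
    """Hypothetical per-VPN-customer baseline (not taken from the paper)."""
    baseline_bps: float                # assumed typical traffic volume
    drop_ratio: float                  # assumed tunable: fraction of baseline lost before alerting
    min_baseline_bps: float = 1_000.0  # assumed cutoff: skip customers that are normally near idle


def traffic_drop_rule(profile: CustomerProfile, observed_bps: float) -> bool:
    """Return True when observed traffic falls below the customer's tuned floor."""
    if profile.baseline_bps < profile.min_baseline_bps:
        # Too little traffic to judge; skipping avoids false positives.
        return False
    floor = profile.baseline_bps * (1.0 - profile.drop_ratio)
    return observed_bps < floor


def detect(profiles: dict[str, CustomerProfile], observed: dict[str, float]) -> list[str]:
    """Apply the rule to every profiled customer; return flagged VPN names, sorted."""
    return sorted(
        vpn
        for vpn, bps in observed.items()
        if vpn in profiles and traffic_drop_rule(profiles[vpn], bps)
    )


# Hypothetical example data.
profiles = {
    "vpn-a": CustomerProfile(baseline_bps=1_000_000.0, drop_ratio=0.8),
    "vpn-b": CustomerProfile(baseline_bps=500.0, drop_ratio=0.5),
}
observed = {"vpn-a": 100_000.0, "vpn-b": 0.0}
flagged = detect(profiles, observed)
```

Here the same rule yields different outcomes for the two customers because its parameters come from each customer's profile: a customer that is normally almost idle is not flagged when its traffic reaches zero, while a busy customer is flagged on a sharp drop.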
About the journal:
IEEE Transactions on Network and Service Management will publish (online only) peer-reviewed, archival-quality papers that advance the state of the art and practical applications of network and service management. Theoretical research contributions (presenting new concepts and techniques) and applied contributions (reporting on experiences and experiments with actual systems) will be encouraged. These transactions will focus on the key technical issues related to: Management Models, Architectures and Frameworks; Service Provisioning, Reliability and Quality Assurance; Management Functions; Enabling Technologies; Information and Communication Models; Policies; Applications and Case Studies; Emerging Technologies and Standards.