Pepple: Programmable network measurement for troubleshooting soft failures
Chen Xu, Xiaoban Wu, Yan Luo, B. Tierney, Jeronimo Bezerra
2016 IEEE 37th Sarnoff Symposium, 2016-09-01
DOI: 10.1109/SARNOF.2016.7846743 (https://doi.org/10.1109/SARNOF.2016.7846743)
Abstract
Networks have been expanding in scale and speed; however, network problems remain difficult to troubleshoot because different administrative domains apply their own measurement policies and services. Moreover, many network issues are subtle, e.g., a link that becomes increasingly slow but remains connected; in such cases active measurement is instrumental. While many measurement infrastructures have been developed and used, the measurement and troubleshooting process typically requires human intervention, which leads to inefficiency. In this work, we propose a programmable network measurement approach to address the challenges of automatic measurement and troubleshooting. We design a control plane that learns from historical measurement results to build a graph of available measurement hosts and their routes. On top of this control plane, we present a set of APIs that allow network operators to define measurement tasks programmatically and to initiate measurements that locate problematic links automatically. The measurement control plane is implemented in 300 lines of Python code. We show use cases of the proposed APIs in which we locate problematic network link(s) in 15 minutes with fewer than 10 lines of Python code running on perfSONAR infrastructure, compared to hours with a conventional troubleshooting approach.
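The idea of building a graph from recorded routes and then narrowing down a problematic link can be illustrated with a minimal sketch. This is not Pepple's API or implementation, which the abstract does not spell out. The function names (`build_graph`, `links_of`, `suspect_links`), the host and router labels, the throughput figures, and the 500 Mbps threshold are all hypothetical, and the localization rule used here (keep links shared by every degraded route, then discard links seen on a healthy route) is a simple stand-in heuristic.

```python
from collections import defaultdict


def build_graph(routes):
    """Build an undirected adjacency map from recorded hop lists."""
    graph = defaultdict(set)
    for hops in routes:
        for a, b in zip(hops, hops[1:]):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def links_of(hops):
    """Return the set of undirected links along one route."""
    return {frozenset(pair) for pair in zip(hops, hops[1:])}


def suspect_links(routes, is_degraded):
    """Links shared by every degraded route and absent from all healthy ones."""
    shared_bad = None
    healthy = set()
    for hops in routes:
        links = links_of(hops)
        if is_degraded(hops[0], hops[-1]):
            shared_bad = links if shared_bad is None else shared_bad & links
        else:
            healthy |= links
    return (shared_bad or set()) - healthy


# Hypothetical historical routes between measurement hosts A, B, C
# through routers r1, r2, r3 (illustrative data, not from the paper).
routes = [
    ["A", "r1", "r2", "B"],
    ["A", "r1", "r3", "C"],
    ["C", "r3", "r2", "B"],
]

# Hypothetical throughput results in Mbps; in practice these would come
# from active measurements between the hosts.
throughput_mbps = {("A", "B"): 40, ("A", "C"): 940, ("C", "B"): 35}


def is_degraded(src, dst, threshold_mbps=500):
    """Treat a host pair as degraded when throughput is below the threshold."""
    return throughput_mbps[(src, dst)] < threshold_mbps


graph = build_graph(routes)
suspects = suspect_links(routes, is_degraded)
print(suspects)
```

In this toy data both slow routes end at B through r2 while the fast route avoids that hop, so the only link that survives the filtering is the one between r2 and B. A real deployment would also have to handle asymmetric routes, multiple simultaneous faults, and measurement noise, none of which this sketch attempts.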