Efficient end-to-end failure probing matrix construction in data center networks
Zequn Jia; Qiang Liu; Ying He; Qianqian Wu; Ren Ping Liu; Yantao Sun
Journal of Communications and Networks, vol. 25, no. 4, pp. 532-543, published 2023-08-01
DOI: 10.23919/JCN.2023.000029
URL: https://ieeexplore.ieee.org/document/10251737/
Impact factor: 2.9 (JCR Q2, Computer Science, Information Systems)
Citations: 0
Abstract
Data centers play an essential role in the functioning of modern society. However, failures are unavoidable in data center networks (DCNs) and negatively affect the applications that run on them. Researchers are therefore interested in the rapid detection and localization of failures in DCNs. In this paper, we present a theoretical model for analyzing end-to-end failure detection methods in data center networks. Our numerical results verify that the proposed model is accurate. In addition, we propose an algorithm that constructs probing matrices based on an enhanced probing path selection indicator. We also apply deep reinforcement learning (DRL) to the problem and propose a DRL-based probing matrix construction algorithm. Our experimental results show that both proposed algorithms achieve better detection accuracy than existing methods. We also discuss the scenarios in which each algorithm is applicable and can improve detection accuracy or construction speed.
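To make the term "probing matrix" concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the abstract does not define the enhanced path selection indicator or the DRL formulation, so the selection step below is a generic greedy coverage heuristic swapped in for illustration. The toy links (`a` to `d`), the candidate paths, and the function names `build_probing_matrix` and `greedy_select` are all hypothetical.

```python
from typing import FrozenSet, List, Sequence


def build_probing_matrix(paths: Sequence[FrozenSet[str]],
                         links: Sequence[str]) -> List[List[int]]:
    """Return a 0/1 matrix: entry (i, j) is 1 when probe path i traverses link j."""
    return [[1 if link in path else 0 for link in links] for path in paths]


def greedy_select(candidates: Sequence[FrozenSet[str]],
                  links: Sequence[str]) -> List[FrozenSet[str]]:
    """Generic greedy set cover (an assumption, not the paper's indicator):
    repeatedly pick the path that covers the most still-uncovered links."""
    uncovered = set(links)
    remaining = list(candidates)
    chosen: List[FrozenSet[str]] = []
    while uncovered:
        best = max(remaining, key=lambda p: len(uncovered & p), default=None)
        if best is None or not (uncovered & best):
            break  # no remaining path covers any uncovered link
        chosen.append(best)
        remaining.remove(best)
        uncovered -= best
    return chosen


# Hypothetical toy network: four links and four candidate end-to-end paths.
links = ["a", "b", "c", "d"]
candidates = [frozenset("ab"), frozenset("bc"), frozenset("cd"), frozenset("ad")]

chosen = greedy_select(candidates, links)
matrix = build_probing_matrix(chosen, links)
print(matrix)  # [[1, 1, 0, 0], [0, 0, 1, 1]]
```

Each row of the matrix is one probe path and each column is one link, so a failed probe implicates the links marked 1 in its row. Covering every link only once, as this sketch does, is enough to detect a failure but not necessarily to localize it, which is one reason path selection criteria matter.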
About the journal:
The JOURNAL OF COMMUNICATIONS AND NETWORKS is published six times per year, and is committed to publishing high-quality papers that advance the state of the art and practical applications of communications and information networks. Theoretical research contributions presenting new techniques, concepts, or analyses, applied contributions reporting on experiences and experiments, and tutorial expositions of permanent reference value are welcome. The subjects covered by this journal include all topics in communication theory and techniques, communication systems, and information networks: COMMUNICATION THEORY AND SYSTEMS; WIRELESS COMMUNICATIONS; NETWORKS AND SERVICES.