{"title":"利用局部故障块为三维环形网络嵌入高效容错路径","authors":"Weibei Fan;Fu Xiao;Mengjie Lv;Lei Han;Shui Yu","doi":"10.1109/TC.2024.3416695","DOIUrl":null,"url":null,"abstract":"3D tori are significant interconnection architectures in building supercomputers and parallel computing systems. Due to the rapid growth of edge faults and the crucial role of path structures in large-scale distributed systems, fault-tolerant path embedding and correlated issues have drawn widespread researches. However, existing path embedding methods are based on traditional fault models, allowing all faults to be near the same node, so they usually only focus on theoretical proof and generate linear fault-tolerance related to dimension \n<inline-formula><tex-math>$n$</tex-math></inline-formula>\n. In order to improve the fault-tolerance of 3D torus, we first propose a novel conditional fault model called the Locally Faulty Block model (LFB model). On the basis of this model, the Hamiltonian paths with large-scale edge defects in torus are investigated. After that, we construct an Hamiltonian path embedding algorithm HP-LFB into torus with \n<inline-formula><tex-math>$O(N)$</tex-math></inline-formula>\n under the LFB model, where \n<inline-formula><tex-math>$N$</tex-math></inline-formula>\n is the number of nodes in torus. Furthermore, we present an adaptive routing algorithm HoeFA, which is based on the method of distance vector to limit the use of virtual channels (VCs). We also make a comparison with state-of-the-art schemes, indicating that our scheme enhance other comprehensive results. The experiment indicated that HP-LFB can sustain the dynamic degradation of the batting average of establishing Hamiltonian paths, with the added faulty edges exceeding fault-tolerance.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"73 9","pages":"2305-2319"},"PeriodicalIF":3.6000,"publicationDate":"2024-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Efficient Fault-Tolerant Path Embedding for 3D Torus Network Using Locally Faulty Blocks\",\"authors\":\"Weibei Fan;Fu Xiao;Mengjie Lv;Lei Han;Shui Yu\",\"doi\":\"10.1109/TC.2024.3416695\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"3D tori are significant interconnection architectures in building supercomputers and parallel computing systems. Due to the rapid growth of edge faults and the crucial role of path structures in large-scale distributed systems, fault-tolerant path embedding and correlated issues have drawn widespread researches. However, existing path embedding methods are based on traditional fault models, allowing all faults to be near the same node, so they usually only focus on theoretical proof and generate linear fault-tolerance related to dimension \\n<inline-formula><tex-math>$n$</tex-math></inline-formula>\\n. In order to improve the fault-tolerance of 3D torus, we first propose a novel conditional fault model called the Locally Faulty Block model (LFB model). On the basis of this model, the Hamiltonian paths with large-scale edge defects in torus are investigated. After that, we construct an Hamiltonian path embedding algorithm HP-LFB into torus with \\n<inline-formula><tex-math>$O(N)$</tex-math></inline-formula>\\n under the LFB model, where \\n<inline-formula><tex-math>$N$</tex-math></inline-formula>\\n is the number of nodes in torus. Furthermore, we present an adaptive routing algorithm HoeFA, which is based on the method of distance vector to limit the use of virtual channels (VCs). We also make a comparison with state-of-the-art schemes, indicating that our scheme enhance other comprehensive results. The experiment indicated that HP-LFB can sustain the dynamic degradation of the batting average of establishing Hamiltonian paths, with the added faulty edges exceeding fault-tolerance.\",\"PeriodicalId\":13087,\"journal\":{\"name\":\"IEEE Transactions on Computers\",\"volume\":\"73 9\",\"pages\":\"2305-2319\"},\"PeriodicalIF\":3.6000,\"publicationDate\":\"2024-06-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Computers\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10564834/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computers","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10564834/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Efficient Fault-Tolerant Path Embedding for 3D Torus Network Using Locally Faulty Blocks
3D tori are significant interconnection architectures in building supercomputers and parallel computing systems. Due to the rapid growth of edge faults and the crucial role of path structures in large-scale distributed systems, fault-tolerant path embedding and correlated issues have drawn widespread researches. However, existing path embedding methods are based on traditional fault models, allowing all faults to be near the same node, so they usually only focus on theoretical proof and generate linear fault-tolerance related to dimension
$n$
. In order to improve the fault-tolerance of 3D torus, we first propose a novel conditional fault model called the Locally Faulty Block model (LFB model). On the basis of this model, the Hamiltonian paths with large-scale edge defects in torus are investigated. After that, we construct an Hamiltonian path embedding algorithm HP-LFB into torus with
$O(N)$
under the LFB model, where
$N$
is the number of nodes in torus. Furthermore, we present an adaptive routing algorithm HoeFA, which is based on the method of distance vector to limit the use of virtual channels (VCs). We also make a comparison with state-of-the-art schemes, indicating that our scheme enhance other comprehensive results. The experiment indicated that HP-LFB can sustain the dynamic degradation of the batting average of establishing Hamiltonian paths, with the added faulty edges exceeding fault-tolerance.
期刊介绍:
The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.