Adaptive and fault-tolerant routing with 100% node utilization for mesh multicomputer
Sheng-de Wang, Ming-Jer Tsai
Proceedings 1998 International Conference on Parallel and Distributed Systems (Cat. No.98TB100250), 1998-12-14
DOI: 10.1109/ICPADS.1998.741099
Citations: 1
Abstract
We propose an adaptive, deadlock-free routing algorithm that tolerates irregular fault patterns using two virtual channels per physical link. It can raise node utilization to as much as 100%. Whenever a node fails or recovers, the central control unit constructs a directed path graph, which is used to generate the intermediate nodes of a message path. A message can thus be sent from a source, or delivered to a destination, inside a faulty block by way of a set of "intermediate nodes". If no central control unit is available, the method requires global failure information.
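
The idea of routing through intermediate nodes can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify how the directed path graph is built, so the code below assumes a 2D mesh, substitutes a plain breadth-first search over healthy nodes for the path graph, and takes the intermediate nodes to be the points where the resulting path changes direction. It does not model virtual channels or deadlock freedom. The names `find_path` and `intermediate_nodes` are hypothetical.

```python
from collections import deque


def find_path(width, height, faulty, src, dst):
    """Breadth-first search over the healthy nodes of a width x height mesh.

    Stand-in for the paper's directed path graph (an assumption).
    Returns a list of (x, y) nodes from src to dst, or None if unreachable.
    """
    if src in faulty or dst in faulty:
        return None
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and nxt not in faulty and nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def intermediate_nodes(path):
    """Return the nodes at which the path changes direction.

    Treating turning points as intermediate nodes is an illustrative
    assumption, not the definition used in the paper.
    """
    turns = []
    for prev, cur, nxt in zip(path, path[1:], path[2:]):
        step_in = (cur[0] - prev[0], cur[1] - prev[1])
        step_out = (nxt[0] - cur[0], nxt[1] - cur[1])
        if step_in != step_out:
            turns.append(cur)
    return turns


if __name__ == "__main__":
    faulty = {(1, 0), (1, 1)}
    path = find_path(4, 4, faulty, (0, 0), (3, 0))
    print("path:", path)
    print("intermediate nodes:", intermediate_nodes(path))
```

In this sketch a message travels in a straight line between consecutive intermediate nodes, so only the list of turning points needs to accompany it rather than the full path.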