{"title":"Concurrent rollback for crash recovery in extended hypercube networks","authors":"T. Juang, C. Chiu, Kun-Ming Yu","doi":"10.1109/AISPAS.1995.401336","DOIUrl":null,"url":null,"abstract":"Recovering from processor failures is an important problem in the design and development of reliable systems. We present a concurrent rollback algorithm in extended hypercube networks to recover from crash failures which involves small message and time complexities. The network of an extended hypercube is a hierarchical, low diameter, recursive structure. By appending only O(1) additional information to each message, we use less than O(Nlog N) message exchanges and O(log/sup 2/ N) time elapsed for recovery work where N is the number of processors of the extended hypercube network. The algorithms can be used to recover from the failure of an arbitrary number of processors.<<ETX>>","PeriodicalId":321580,"journal":{"name":"Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis","volume":"537 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1995-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AISPAS.1995.401336","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citation count: 0
Abstract
Recovering from processor failures is an important problem in the design and development of reliable systems. We present a concurrent rollback algorithm for extended hypercube networks that recovers from crash failures with low message and time complexity. An extended hypercube network is a hierarchical, low-diameter, recursive structure. By appending only O(1) additional information to each message, the algorithm uses fewer than O(N log N) message exchanges and O(log^2 N) time for recovery, where N is the number of processors in the extended hypercube network. The algorithm can be used to recover from the failure of an arbitrary number of processors.
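To give a feel for how the two stated bounds grow, the sketch below tabulates N log N and log^2 N for a few network sizes. It is an illustration only, not the paper's algorithm: the function name `recovery_bounds`, the base-2 logarithm, and the unit constant factors are all assumptions, since the abstract gives asymptotic orders rather than exact counts.

```python
import math


def recovery_bounds(n):
    """Return (N * log2 N, (log2 N)^2) for a network of n processors.

    Assumptions (not from the abstract): logarithms are base 2 and all
    constant factors are 1, so the values show growth rates, not real
    message counts or recovery times.
    """
    if n < 2:
        raise ValueError("need at least two processors")
    log_n = math.log2(n)
    return n * log_n, log_n ** 2


if __name__ == "__main__":
    # Hypothetical network sizes chosen as powers of two for readability.
    for exponent in (4, 8, 12, 16):
        n = 2 ** exponent
        messages, time_steps = recovery_bounds(n)
        print(f"N={n:>6}  N log N={messages:>10.0f}  log^2 N={time_steps:>5.0f}")
```

Under these assumptions, N = 1024 gives a message growth term of 10240 against a time growth term of 100, which shows how much more slowly the time bound grows than the message bound.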