A Similar Resource Auto-Discovery Based Adaptive Fault-tolerance Method for Embedded Distributed System
Kailong Zhang, Ke Liang, Xingshe Zhou, Kaibo Wang, Xiao Wu, Zhiyi Yang
2007 International Conference on Parallel Processing Workshops (ICPPW 2007), 2007-09-10. DOI: 10.1109/ICPPW.2007.15 (https://doi.org/10.1109/ICPPW.2007.15)
Abstract
Because Embedded Distributed Systems (EDS) are resource-constrained yet must be highly reliable, they call for fault-tolerance techniques that differ from traditional hardware-redundancy approaches. This article proposes and discusses a fault-tolerance method based on similar resources, together with related technologies. First, mathematical models of the key elements, such as computing nodes, similar nodes, and tasks, are constructed. Then, similarity computation methods and evaluation criteria are presented from two views: tasks and resources. Building on this theory, several methods are described in turn: similar-node auto-discovery (SNAD) and its optimized variant (oSNAD), automatic deployment of redundant tasks, and reconfiguration policies for faulty tasks and nodes. Simulation results show that these approaches and schemes can improve the adaptive fault-tolerance capabilities of complex embedded distributed systems.