Decentralized Local Failure Detection in Dynamic Distributed Systems
Nigamanth Sridhar
2006 25th IEEE Symposium on Reliable Distributed Systems (SRDS'06), 2006-10-02
DOI: 10.1109/SRDS.2006.16 (https://doi.org/10.1109/SRDS.2006.16)
A failure detector is an important building block when constructing fault-tolerant distributed systems. In asynchronous distributed systems, failed processes are often indistinguishable from slow processes. A failure detector is an oracle that can intelligently suspect processes of having failed. Different classes of failure detectors have been proposed to solve different kinds of problems. Almost all of this work focuses on global failure detection and, moreover, on systems that contain neither mobile nodes nor dynamic topologies. In this paper, we present ◇Pml, a local failure detector that can tolerate mobility and topology changes. This means that ◇Pml can distinguish between a failed process and a process that has moved away from its original location. We also establish an upper bound on the duration for which a process wrongly suspects a node that has moved away from its neighborhood. We support our theoretical results with experimental findings from an implementation of this algorithm for sensor networks.
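To make the idea of a local failure detector concrete, the sketch below shows a generic heartbeat-based detector that monitors only a node's one-hop neighbors. It is not the algorithm from the paper, which the abstract does not describe. The class name `LocalFailureDetector`, the timeout values, and the `on_departure` hook (standing in for whatever evidence shows that a neighbor has moved rather than crashed) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LocalFailureDetector:
    """Generic heartbeat detector over one-hop neighbors (illustrative only)."""

    initial_timeout: float = 2.0  # assumed starting timeout, in seconds
    timeout_step: float = 1.0     # assumed increase after each false suspicion
    last_heard: dict = field(default_factory=dict)
    timeout: dict = field(default_factory=dict)
    suspected: set = field(default_factory=set)

    def on_heartbeat(self, neighbor, now):
        """Record a heartbeat received from a neighbor at time `now`."""
        self.timeout.setdefault(neighbor, self.initial_timeout)
        if neighbor in self.suspected:
            # The earlier suspicion was wrong: withdraw it and be more
            # patient with this neighbor from now on.
            self.suspected.discard(neighbor)
            self.timeout[neighbor] += self.timeout_step
        self.last_heard[neighbor] = now

    def on_departure(self, neighbor):
        """Hypothetical hook: the neighbor is known to have moved away,
        so stop monitoring it and drop any suspicion of it."""
        self.last_heard.pop(neighbor, None)
        self.timeout.pop(neighbor, None)
        self.suspected.discard(neighbor)

    def check(self, now):
        """Suspect every monitored neighbor whose heartbeat is overdue."""
        for neighbor, heard in self.last_heard.items():
            if now - heard > self.timeout[neighbor]:
                self.suspected.add(neighbor)
        return set(self.suspected)


if __name__ == "__main__":
    fd = LocalFailureDetector()
    fd.on_heartbeat("a", now=0.0)
    fd.on_heartbeat("b", now=0.0)
    fd.on_heartbeat("b", now=2.0)
    print(fd.check(now=3.0))  # "a" is overdue at this point, "b" is not
```

Raising the timeout after each false suspicion is the usual way such detectors eventually stop suspecting slow but correct neighbors. Under this sketch's assumptions, the time during which a departed neighbor stays wrongly suspected is the gap between its timeout expiring and `on_departure` being called.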