Replacement: Decentralized Failure Handling for Replicated State Machines
Leander Jehl, T. E. Lea, H. Meling
2015 IEEE 34th Symposium on Reliable Distributed Systems (SRDS), published 2015-09-28
DOI: 10.1109/SRDS.2015.29 (https://doi.org/10.1109/SRDS.2015.29)
We investigate methods for handling failures in a Paxos state machine and introduce Replacement, a novel failure-handling approach. Replacement is fully decentralized and does not rely on consensus. This allows failed replicas to be replaced quickly, avoiding the bottleneck of a single leader. Instead of handling failures in the order proposed by a leader, Replacement combines concurrent replacements to guarantee that all failed replicas are replaced. Replacement also allows the state machine to process client requests during failure handling, even while replicas disagree on the current configuration. As our evaluation shows, this enables Replacement to handle failures quickly, with minimal disruption to the processing of client requests.
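
The idea of combining concurrent replacements, rather than ordering them through a leader, can be illustrated with a small sketch. This is not the protocol from the paper: the `Replacement` record, the `combine` function, and the tie-break rule for competing substitutes are all hypothetical, chosen only to show that merging replacements can give the same configuration regardless of the order in which they are observed.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Replacement:
    """Hypothetical record: swap a suspected-failed replica for a fresh one."""
    old: str  # replica believed to have failed
    new: str  # replica proposed to take its place


def combine(config: FrozenSet[str], replacements: Iterable[Replacement]) -> FrozenSet[str]:
    """Merge concurrent replacements into one new configuration.

    Assumption (not from the paper): if several substitutes are proposed
    for the same failed replica, the lexicographically smallest name wins.
    Any deterministic rule would make the result order-independent.
    """
    chosen = {}
    for r in replacements:
        if r.old not in config:
            continue  # ignore replacements for replicas outside this configuration
        if r.old not in chosen or r.new < chosen[r.old]:
            chosen[r.old] = r.new
    # Every replica with a chosen substitute is swapped; the rest are kept.
    return frozenset(chosen.get(replica, replica) for replica in config)


config = frozenset({"A", "B", "C"})
r1 = Replacement(old="A", new="D")  # issued by one replica
r2 = Replacement(old="B", new="E")  # issued concurrently by another

# Both observation orders lead to the same configuration.
merged_one_way = combine(config, [r1, r2])
merged_other_way = combine(config, [r2, r1])
```

In this sketch both failed replicas end up replaced without any replica having to agree first on which replacement comes "before" the other, which is the property the abstract attributes to Replacement. How the actual protocol keeps client requests flowing while replicas still hold different views of the configuration is not modelled here.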