Replacement: Decentralized Failure Handling for Replicated State Machines

2015 IEEE 34th Symposium on Reliable Distributed Systems (SRDS). Publication date: 2015-09-28. DOI: 10.1109/SRDS.2015.29

Leander Jehl, T. E. Lea, H. Meling


Citations: 3

Abstract


We investigate methods for handling failures in a Paxos State Machine and introduce Replacement, a novel approach to handle failures. Replacement is fully decentralized and does not rely on consensus. This allows failed replicas to be replaced quickly, avoiding the bottleneck of a single leader. Instead of handling failures in the order proposed by a leader, concurrent replacements are combined to guarantee that all failed replicas are replaced. Replacement also allows the state machine to process client requests during failure handling, even while disagreeing on the current configuration. As our evaluation shows, this enables Replacement to quickly handle failures, with minimal disruption in the processing of client requests.
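The abstract states that concurrent replacements are combined rather than ordered by a leader, but it does not describe the mechanism. The following is only an illustrative sketch of one way such combining could work, not the protocol from the paper: it assumes a configuration is a set of numbered slots, that each replacement carries a hypothetical per-slot `epoch` counter, and that replicas merge replacements by keeping the highest `(epoch, name)` pair per slot. The names `Replacement`, `Config`, and `combine` are invented for this example.

```python
from dataclasses import dataclass

# Hypothetical model: slot number -> (epoch, replica name).
# This is an assumption made for illustration, not the paper's data structure.
Config = dict[int, tuple[int, str]]


@dataclass(frozen=True)
class Replacement:
    """Hypothetical record: the replica in `slot` is replaced by `new`."""
    slot: int
    epoch: int  # assumed per-slot counter that grows with each replacement
    new: str


def combine(config: Config, repl: Replacement) -> Config:
    """Apply one replacement, keeping the larger (epoch, name) pair per slot.

    Taking a per-slot maximum is commutative and idempotent, so replicas
    that learn of replacements in different orders reach the same result.
    """
    merged = dict(config)
    candidate = (repl.epoch, repl.new)
    current = merged.get(repl.slot)
    if current is None or candidate > current:
        merged[repl.slot] = candidate
    return merged


initial: Config = {0: (0, "a"), 1: (0, "b"), 2: (0, "c")}
r1 = Replacement(slot=0, epoch=1, new="d")  # replica "a" failed
r2 = Replacement(slot=2, epoch=1, new="e")  # replica "c" failed concurrently

# Two replicas observe the replacements in opposite orders.
view_x = combine(combine(initial, r1), r2)
view_y = combine(combine(initial, r2), r1)

assert view_x == view_y
print(sorted(name for _, name in view_x.values()))  # ['b', 'd', 'e']
```

In this sketch both failed replicas end up replaced regardless of delivery order, which mirrors the guarantee the abstract claims. How the actual protocol keeps client requests flowing while replicas disagree on the configuration is not covered here.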
