Self-Stabilizing Snapshot Objects for Asynchronous Failure-Prone Networked Systems

Chryssis Georgiou, Oskar Lundström, E. Schiller
Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, 2019-06-19
DOI: 10.1145/3293611.3331584 (https://doi.org/10.1145/3293611.3331584)
Citations: 10

Abstract

A snapshot object simulates the behavior of an array of single-writer/multi-reader shared registers that can be read atomically. Delporte-Gallet et al. proposed two fault-tolerant algorithms for snapshot objects in asynchronous crash-prone message-passing systems. Their first algorithm is non-blocking: it allows snapshot operations to terminate once all write operations have ceased. It uses O(n) messages of O(n·v) bits, where n is the number of nodes and v is the number of bits needed to represent the object. Their second algorithm allows snapshot operations to always terminate, independently of write operations. It incurs O(n^2) messages. The fault model of Delporte-Gallet et al. considers node failures (crashes). We aim to design even more robust snapshot objects, and we do so through the lens of self-stabilization, a very strong notion of fault tolerance. In addition to tolerating the faults in Delporte-Gallet et al.'s model, a self-stabilizing algorithm can recover after the occurrence of transient faults; these faults represent arbitrary violations of the assumptions according to which the system was designed to operate (as long as the code stays intact). In particular, we propose self-stabilizing variations of Delporte-Gallet et al.'s non-blocking and always-terminating algorithms. Our algorithms have communication costs similar to those of Delporte-Gallet et al. and O(1) recovery time (in terms of asynchronous cycles) from transient faults. The main differences are that our proposal uses repeated gossiping of O(v)-bit messages and deals with bounded space, which is a prerequisite for self-stabilization.
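To make the terms "snapshot object" and "non-blocking" concrete, here is a minimal single-process sketch in Python. It is not the algorithm of Delporte-Gallet et al. or of this paper: it models the registers as a local list rather than a message-passing system, it uses the classic double-collect idea as a stand-in for a non-blocking snapshot, and it uses unbounded sequence numbers, which the abstract notes a self-stabilizing solution must avoid. The class name SnapshotObject, the interfere hook, and max_attempts are assumptions introduced only for this illustration.

```python
from typing import Any, Callable, List, Optional, Tuple


class SnapshotObject:
    """Toy model: n single-writer/multi-reader registers plus a snapshot operation."""

    def __init__(self, n: int) -> None:
        # Register i holds (sequence_number, value); only node i writes it.
        # Assumption: unbounded sequence numbers, unlike a bounded-space design.
        self.reg: List[Tuple[int, Any]] = [(0, None) for _ in range(n)]

    def write(self, i: int, value: Any) -> None:
        """Node i overwrites its own register and bumps its sequence number."""
        seq, _ = self.reg[i]
        self.reg[i] = (seq + 1, value)

    def collect(self) -> List[Tuple[int, Any]]:
        """Read all registers one after another (not atomic by itself)."""
        return list(self.reg)

    def snapshot(
        self,
        interfere: Optional[Callable[[], None]] = None,
        max_attempts: int = 1000,
    ) -> Optional[List[Any]]:
        """Double collect: retry until two consecutive collects are identical.

        `interfere` is a hypothetical hook that simulates writes arriving
        between two collects. The loop ends once writes cease, which is the
        non-blocking guarantee; `max_attempts` only keeps the demo finite.
        """
        prev = self.collect()
        for _ in range(max_attempts):
            if interfere is not None:
                interfere()
            cur = self.collect()
            if cur == prev:
                return [value for _, value in cur]
            prev = cur
        return None  # writers never stopped within the attempt budget


obj = SnapshotObject(3)
obj.write(0, "a")
obj.write(2, "c")
print(obj.snapshot())  # ['a', None, 'c']
```

If the interfere hook keeps writing forever, snapshot never sees two equal collects and returns None here. That is the behavior the always-terminating variant described in the abstract is meant to rule out, at the cost of more messages.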