Flexible, Cost-Effective Membership Agreement in Synchronous Systems
Authors: R. Barbosa, J. Karlsson
Venue: 2006 12th Pacific Rim International Symposium on Dependable Computing (PRDC'06)
Publication date: 2006-12-18
DOI: 10.1109/PRDC.2006.36 (https://doi.org/10.1109/PRDC.2006.36)
Citations: 6
Abstract
This paper presents a processor group membership protocol for fault-tolerant distributed real-time systems that use periodic, time-triggered scheduling to send messages over the system's communication network. The protocol allows fault-free nodes to reach agreement on the operational state of all nodes in the presence of fail-silent or fail-reporting node failures as well as network failures (lost or corrupted messages). The protocol is based on the principle that each message sent by a node in the membership is acknowledged by k other nodes in a system of n nodes, where k can be set to any number between 2 and n - 1. Agreement on node failure (membership departure) and agreement on node recovery (membership reintegration) are handled by two different mechanisms. Agreement on departure is guaranteed if no more than f = k - 1 failures occur in the same communication round, while at most one node can be reintegrated into the membership per communication round.
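
The relation between k and f stated in the abstract can be illustrated with a small counting sketch. This is not the paper's protocol. It only checks one plausible reading of the bound: if each message has k acknowledgers and at most k - 1 nodes fail in a round, at least one acknowledger of every message is still fault-free. The abstract does not say which nodes acknowledge a message, so the round-robin assignment below is an assumption, as are all function names and the inclusive reading of "between 2 and n - 1".

```python
from itertools import combinations


def tolerated_failures(n, k):
    """Return f = k - 1, the per-round failure bound quoted in the abstract.

    Assumption: the range for k is read as inclusive, 2 <= k <= n - 1.
    """
    if not 2 <= k <= n - 1:
        raise ValueError("k must satisfy 2 <= k <= n - 1")
    return k - 1


def acknowledgers(sender, n, k):
    """Hypothetical assignment: the k nodes that follow the sender
    in round-robin slot order acknowledge its message."""
    return [(sender + i) % n for i in range(1, k + 1)]


def every_message_has_live_ack(n, k, failed):
    """True if each sender keeps at least one fault-free acknowledger."""
    return all(
        any(a not in failed for a in acknowledgers(s, n, k))
        for s in range(n)
    )


def bound_holds(n, k):
    """Exhaustively check every failure set of size at most f = k - 1."""
    f = tolerated_failures(n, k)
    return all(
        every_message_has_live_ack(n, k, set(failed))
        for size in range(f + 1)
        for failed in combinations(range(n), size)
    )
```

For example, with n = 5 and k = 3 every set of up to two failed nodes leaves each message with a fault-free acknowledger, whereas the three failures {1, 2, 3} silence all acknowledgers of node 0. Choosing a larger k raises the number of tolerated failures at the cost of more acknowledgments per message, which matches the flexibility the abstract describes.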