{"title":"Distributed Snapshot algorithm for multi-active object-based applications","authors":"Michel Jackson de Souza, F. Baude","doi":"10.5753/wtf.2014.22947","DOIUrl":null,"url":null,"abstract":"This paper exposes an adaptation of the classic algorithm for consistent snapshot in distributed systems with asynchronous processes due to Chandy&Lamport. A snapshot in this context is described as the consistent set of states of all involved communicating processes that allows recovering the whole system after a crash. The reconstructed system state is consistent, even if messages injected into the system from the outside while the snapshot was ongoing may have been lost (if such messages can not be replayed). We expose how to adapt this algorithm to a particular distributed programming model, the Active Object model (in its multi active version). We applied it successfully to a non trivial distributed application programmed using Active Objects serving as a publish/subscribe and storage of events middleware, dubbed the EventCloud.","PeriodicalId":321409,"journal":{"name":"Anais do XV Workshop de Testes e Tolerância a Falhas (WTF 2014)","volume":"219 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Anais do XV Workshop de Testes e Tolerância a Falhas (WTF 2014)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5753/wtf.2014.22947","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citation count: 1
Abstract
This paper presents an adaptation of the classic Chandy and Lamport algorithm for taking consistent snapshots in distributed systems with asynchronous processes. A snapshot in this context is the consistent set of states of all communicating processes involved, from which the whole system can be recovered after a crash. The reconstructed system state is consistent, although messages injected into the system from the outside while the snapshot was in progress may be lost if they cannot be replayed. We show how to adapt this algorithm to a particular distributed programming model, the Active Object model (in its multi-active version). We applied it successfully to a non-trivial distributed application programmed using Active Objects, the EventCloud, a middleware for publish/subscribe and storage of events.
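For readers unfamiliar with the underlying algorithm, the following is a minimal single-threaded sketch of the classic Chandy and Lamport marker rule, assuming reliable FIFO channels. It is not the paper's adaptation to multi-active objects and it is not EventCloud code: the `Process` class, the integer state, and the two-process transfer scenario are all hypothetical names and values chosen for illustration.

```python
from collections import deque

MARKER = "MARKER"  # special control message that delimits the snapshot


class Process:
    """Hypothetical process whose whole state is a single integer."""

    def __init__(self, name, state):
        self.name = name
        self.state = state
        self.peers = {}           # destination name -> Process
        self.inbox = {}           # sender name -> FIFO channel (deque)
        self.snapshotted = False  # has the local state been recorded?
        self.recorded_state = None
        self.channel_log = {}     # sender name -> messages caught in transit
        self.recording = set()    # incoming channels still being recorded

    def connect(self, other):
        # Create a one-way FIFO channel from self to other.
        self.peers[other.name] = other
        other.inbox[self.name] = deque()

    def send(self, dest, msg):
        self.peers[dest].inbox[self.name].append(msg)

    def transfer(self, dest, amount):
        # Application message: move `amount` units to another process.
        self.state -= amount
        self.send(dest, amount)

    def start_snapshot(self):
        self._record_and_broadcast(skip=None)

    def _record_and_broadcast(self, skip):
        # Record the local state, then send a marker on every outgoing channel.
        self.snapshotted = True
        self.recorded_state = self.state
        self.channel_log = {sender: [] for sender in self.inbox}
        # The channel that delivered the first marker is recorded as empty.
        self.recording = {s for s in self.inbox if s != skip}
        for dest in self.peers:
            self.send(dest, MARKER)

    def receive(self, sender):
        msg = self.inbox[sender].popleft()
        if msg == MARKER:
            if not self.snapshotted:
                self._record_and_broadcast(skip=sender)
            else:
                self.recording.discard(sender)  # channel state is now complete
        else:
            if sender in self.recording:
                self.channel_log[sender].append(msg)  # message was in transit
            self.state += msg

    def done(self):
        return self.snapshotted and not self.recording


# Scenario: two processes exchange units while a snapshot is taken.
p, q = Process("p", 10), Process("q", 20)
p.connect(q)
q.connect(p)

p.transfer("q", 3)   # in transit on channel p -> q
q.transfer("p", 5)   # in transit on channel q -> p
p.start_snapshot()   # p records its state and sends a marker to q
p.receive("q")       # the 5 units arrive after p's recording: logged as in transit
q.receive("p")       # q applies the 3 units
q.receive("p")       # q sees the marker, records its state, sends a marker back
p.receive("q")       # p sees q's marker and stops recording channel q -> p

in_transit = sum(sum(msgs) for proc in (p, q) for msgs in proc.channel_log.values())
snapshot_total = p.recorded_state + q.recorded_state + in_transit
```

In this scenario the recorded process states plus the messages logged as in transit add up to the 30 units the system started with, which is the consistency property a snapshot is meant to preserve. Messages arriving from outside the set of snapshotted processes are not covered by this sketch.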