{"title":"On staggered checkpointing","authors":"N. Vaidya","doi":"10.1109/SPDP.1996.570386","DOIUrl":null,"url":null,"abstract":"A consistent checkpointing algorithm serves a consistent view of a distributed application's state on stable storage. The traditional consistent checkpointing algorithms require different processes to save their state at about the same time. This causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce checkpoint overhead. The paper presents a simple approach to arbitrarily stagger the checkpoints. The approach requires that the processes take consistent logical checkpoints, as compared to consistent physical checkpoints enforced by existing algorithms. Experimental results on nCube-2 are presented.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"244 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPDP.1996.570386","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9
Abstract
A consistent checkpointing algorithm serves a consistent view of a distributed application's state on stable storage. The traditional consistent checkpointing algorithms require different processes to save their state at about the same time. This causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce checkpoint overhead. The paper presents a simple approach to arbitrarily stagger the checkpoints. The approach requires that the processes take consistent logical checkpoints, as compared to consistent physical checkpoints enforced by existing algorithms. Experimental results on nCube-2 are presented.