Failure handling in an optimized two-safe approach to maintaining primary-backup systems
Authors: Kexiang Hu, S. Mehrotra, S. Kaplan
Published in: Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281), 1998-10-20
DOI: https://doi.org/10.1109/RELDIS.1998.740488
Citations: 5
Abstract
In a primary-backup database system, transactions are processed at the primary, and the log records they generate are propagated to the backup, which uses them to reconstruct the primary's database state. If the primary fails, the backup takes over to provide continued service. Most existing designs of primary-backup database systems concentrate on tolerating complete failures, in which the entire primary fails, for example because of a disaster. In multiprocessor environments, where the primary and backup databases are partitioned across multiple computers, a more common case is a partial failure, in which some database partitions fail but the system as a whole survives. Existing approaches either ignore partial failures or require the failed database partition to be unavailable. We explore a design of the primary-backup database system that uses the backup not only for disaster protection but also for continued availability during partial failures. The approach is developed in the context of the improved optimized 2-safe strategy for transmitting logs from the primary to the backup, introduced by K. Hu et al. (1997), which combines the best features of the previously developed 1-safe and 2-safe strategies.
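To make the trade-off between the 1-safe and 2-safe strategies concrete, the following minimal sketch models a single primary and backup in memory. It is not the optimized 2-safe protocol of the paper and does not model partial failures. It assumes only the commonly used definitions: under 1-safe the primary commits before the log record reaches the backup, and under 2-safe it commits only after the backup acknowledges the record. All class and function names (`Primary`, `Backup`, `takeover_state`) are hypothetical.

```python
from collections import deque


class Backup:
    """Rebuilds the primary's state by applying shipped log records."""

    def __init__(self):
        self.state = {}

    def apply(self, record):
        key, value = record
        self.state[key] = value
        return True  # acknowledgement returned to the primary


class Primary:
    """Hypothetical primary that ships (key, value) log records to one backup."""

    def __init__(self, backup, two_safe):
        self.backup = backup
        self.two_safe = two_safe
        self.state = {}
        self.unshipped = deque()  # log records not yet sent to the backup

    def commit(self, key, value):
        record = (key, value)
        if self.two_safe:
            # 2-safe: commit only after the backup acknowledges the record.
            if not self.backup.apply(record):
                raise RuntimeError("backup did not acknowledge")
        else:
            # 1-safe: commit locally now and ship the record later.
            self.unshipped.append(record)
        self.state[key] = value

    def ship_pending(self):
        # Asynchronous propagation step used by the 1-safe mode.
        while self.unshipped:
            self.backup.apply(self.unshipped.popleft())


def takeover_state(primary):
    """State the backup would serve if the primary failed completely now."""
    return dict(primary.backup.state)
```

In this sketch, committing `("x", 1)` on a 1-safe primary and failing before `ship_pending()` runs leaves the backup without that write, whereas the 2-safe primary never commits a write the backup has not seen, at the cost of waiting for the acknowledgement on every commit.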