Exploring the Cost-Availability Tradeoff in P2P Storage Systems
Zhi Yang, Yafei Dai, Zhen Xiao
2009 International Conference on Parallel Processing, published 2009-09-22
DOI: https://doi.org/10.1109/ICPP.2009.46
P2P storage systems use replication to provide a certain level of availability. The system must generate new replicas to replace those lost to permanent failures, but it can save significant replication cost by not replicating after transient failures. In real systems, however, it is impossible to reliably distinguish permanent from transient failures, which forces a tradeoff between high recovery cost and low data availability. In this paper, we analyze the use of timeouts as a mechanism to navigate this tradeoff. We address the challenging problem of choosing a timeout that walks the fine line between causing unnecessary replication due to detection inaccuracy and reducing availability due to detection delay. We conduct simulations based on both synthetic and real traces, and show that the performance of our selected timeout closely approximates the best performance achievable with timeouts, and even that of an "oracle" failure detector.
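To make the tradeoff concrete, the following is a minimal sketch and not the paper's selection method. It assumes a list of observed transient downtimes in hours and a hypothetical linear cost that charges a fixed amount per unnecessary repair plus an amount per hour of detection delay. The names `evaluate_timeout`, `pick_timeout`, `repair_cost`, and `delay_cost_per_hour` are illustrative assumptions, as are all the numbers.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TimeoutOutcome:
    timeout: float            # hours a node may stay offline before it is declared lost
    unnecessary_repairs: int  # transient failures wrongly treated as permanent
    detection_delay: float    # hours a truly permanent failure goes unrepaired


def evaluate_timeout(timeout: float, transient_downtimes: Sequence[float]) -> TimeoutOutcome:
    """Score one timeout against observed transient downtimes (hours)."""
    # A transient failure causes a wasted repair when the node stays away
    # longer than the timeout, so the detector gives up on it too early.
    wasted = sum(1 for d in transient_downtimes if d > timeout)
    # A permanent failure is only noticed once the timeout expires, so the
    # timeout itself is the detection delay.
    return TimeoutOutcome(timeout, wasted, timeout)


def pick_timeout(
    candidates: Sequence[float],
    transient_downtimes: Sequence[float],
    repair_cost: float,
    delay_cost_per_hour: float,
) -> float:
    """Return the candidate that minimises the assumed linear cost."""
    def total_cost(t: float) -> float:
        outcome = evaluate_timeout(t, transient_downtimes)
        return (repair_cost * outcome.unnecessary_repairs
                + delay_cost_per_hour * outcome.detection_delay)
    return min(candidates, key=total_cost)


# Hypothetical data: seven transient outages and five candidate timeouts.
downtimes = [0.5, 1, 2, 3, 8, 12, 30]
best = pick_timeout([1, 4, 12, 24, 48], downtimes, repair_cost=10, delay_cost_per_hour=1)
print(best)  # 12
```

With these made-up numbers, a short timeout such as 1 hour triggers five wasted repairs, while a 48-hour timeout avoids them all but leaves permanent failures undetected for two days. The 12-hour candidate balances the two under the assumed cost weights.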