Predicting durability in DHTs using Markov chains

Fabio Picconi, B. Baynat, Pierre Sens
2007 2nd International Conference on Digital Information Management. Published 2007-10-01. DOI: 10.1109/ICDIM.2007.4444278 (https://doi.org/10.1109/ICDIM.2007.4444278)
Citations: 13

Abstract

We consider the problem of data durability in low-bandwidth, large-scale distributed storage systems. Because bandwidth between replicas is limited, these systems suffer from long repair times after a hard disk crash, which makes them vulnerable to data loss when several replicas fail within a short period of time. Recent work has suggested that the probability of data loss can be predicted by modeling the number of live replicas with a Markov chain. This prediction can in turn be used to determine the number of replicas needed to keep the loss probability below a desired value. Previous authors have suggested that the model parameters can be estimated using an expression that is either constant or linear in the number of replicas. Our simulations, however, show that neither is correct: the parameter values grow sublinearly with the number of replicas. Moreover, we show that a linear expression causes the probability of data loss to be underestimated, while a constant expression produces a significant overestimation. Finally, we provide an empirical expression that yields a good approximation of the sublinear parameter values. Our work can be viewed as a first step towards more accurate models for predicting the durability of this type of system.
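The Markov-chain approach summarized in the abstract can be illustrated with a small sketch. It is not the authors' model: it assumes that the parameter in question is the repair rate, that replica failures are independent and exponentially distributed, and that "linear" means proportional to the number of missing replicas. The function names, rates, and time horizon are hypothetical, and the paper's empirical sublinear expression is not reproduced because the abstract does not state it.

```python
def loss_probability(k, fail_rate, repair_rate, horizon, steps=20_000):
    """Probability that all k replicas are lost within `horizon`.

    State i is the number of live replicas; state 0 (data lost) is absorbing.
    fail_rate   -- per-replica failure rate (assumed exponential failures)
    repair_rate -- function (i, k) -> rate of going from i to i + 1 replicas
    The forward equations are integrated with a simple forward-Euler scheme.
    """
    dt = horizon / steps
    # Euler is only valid while every per-step transition probability is < 1.
    max_rate = max(
        i * fail_rate + (repair_rate(i, k) if i < k else 0.0)
        for i in range(1, k + 1)
    )
    if max_rate * dt >= 1.0:
        raise ValueError("time step too coarse; increase `steps`")

    p = [0.0] * (k + 1)
    p[k] = 1.0  # start with all k replicas alive
    for _ in range(steps):
        q = p[:]
        for i in range(1, k + 1):
            down = i * fail_rate * dt  # one of the i live replicas fails
            up = repair_rate(i, k) * dt if i < k else 0.0  # one repair completes
            q[i] -= p[i] * (down + up)
            q[i - 1] += p[i] * down
            if i < k:
                q[i + 1] += p[i] * up
        p = q
    return p[0]


def constant_repair(i, k, mu=1.0 / 7.0):
    """Assumed 'constant' model: one repair stream regardless of state."""
    return mu


def linear_repair(i, k, mu=1.0 / 7.0):
    """Assumed 'linear' model: all k - i missing replicas repaired in parallel."""
    return mu * (k - i)


if __name__ == "__main__":
    # Hypothetical numbers: time in days, one failure per replica-year,
    # one week per repair, one-year horizon.
    for k in (2, 3, 4):
        pc = loss_probability(k, 1.0 / 365.0, constant_repair, 365.0)
        pl = loss_probability(k, 1.0 / 365.0, linear_repair, 365.0)
        print(f"k={k}  constant={pc:.3e}  linear={pl:.3e}")
```

Under these assumptions the linear model always predicts a loss probability no higher than the constant model, so a true repair rate lying between the two would be bracketed in the direction the abstract describes. Choosing the replica count then amounts to increasing k until the predicted loss probability falls below the target.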