A Layered Architecture for Erasure-Coded Consistent Distributed Storage

K. Konwar, N. Prakash, N. Lynch, M. Médard
{"title":"A Layered Architecture for Erasure-Coded Consistent Distributed Storage","authors":"K. Konwar, N. Prakash, N. Lynch, M. Médard","doi":"10.1145/3087801.3087832","DOIUrl":null,"url":null,"abstract":"Motivated by emerging applications to the edge computing paradigm, we introduce a two-layer erasure-coded fault-tolerant distributed storage system offering atomic access for read and write operations. In edge computing, clients interact with an edge-layer of servers that is geographically near; the edge-layer in turn interacts with a back-end layer of servers. The edge-layer provides low latency access and temporary storage for client operations, and uses the back-end layer for persistent storage. Our algorithm, termed Layered Data Storage (LDS) algorithm, offers several features suitable for edge-computing systems, works under asynchronous message-passing environments, supports multiple readers and writers, and can tolerate f1 < n1/2 and f2 < n2/3 crash failures in the two layers having n1 and n2 servers, respectively. We use a class of erasure codes known as regenerating codes for storage of data in the back-end layer. The choice of regenerating codes, instead of popular choices like Reed-Solomon codes, not only optimizes the cost of back-end storage, but also helps in optimizing communication cost of read operations, when the value needs to be recreated all the way from the back-end. The two-layer architecture permits a modular implementation of atomicity and erasure-code protocols; the implementation of erasure-codes is mostly limited to interaction between the two layers. We prove liveness and atomicity of LDS, and also compute performance costs associated with read and write operations. In a system with n1 = Θ(n2), f1 = Θ(n1), f2 = Θ(n2), the write and read costs are respectively given by Θ(n1) and Θ(1) + n1 I(δ > 0). Here δ is a parameter closely related to the number of write operations that are concurrent with the read operation, and I(δ > 0) is 1 if δ > 0, and 0 if δ = 0. The cost of persistent storage in the back-end layer is Θ(1). The impact of temporary storage is minimally felt in a multi-object system running N independent instances of LDS, where only a small fraction of the objects undergo concurrent accesses at any point during the execution. For the multi-object system, we identify a condition on the rate of concurrent writes in the system such that the overall storage cost is dominated by that of persistent storage in the back-end layer, and is given by Θ(N).","PeriodicalId":324970,"journal":{"name":"Proceedings of the ACM Symposium on Principles of Distributed Computing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"20","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM Symposium on Principles of Distributed Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3087801.3087832","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 20

Abstract

Motivated by emerging applications to the edge computing paradigm, we introduce a two-layer erasure-coded fault-tolerant distributed storage system offering atomic access for read and write operations. In edge computing, clients interact with an edge-layer of servers that is geographically near; the edge-layer in turn interacts with a back-end layer of servers. The edge-layer provides low latency access and temporary storage for client operations, and uses the back-end layer for persistent storage. Our algorithm, termed Layered Data Storage (LDS) algorithm, offers several features suitable for edge-computing systems, works under asynchronous message-passing environments, supports multiple readers and writers, and can tolerate f1 < n1/2 and f2 < n2/3 crash failures in the two layers having n1 and n2 servers, respectively. We use a class of erasure codes known as regenerating codes for storage of data in the back-end layer. The choice of regenerating codes, instead of popular choices like Reed-Solomon codes, not only optimizes the cost of back-end storage, but also helps in optimizing communication cost of read operations, when the value needs to be recreated all the way from the back-end. The two-layer architecture permits a modular implementation of atomicity and erasure-code protocols; the implementation of erasure-codes is mostly limited to interaction between the two layers. We prove liveness and atomicity of LDS, and also compute performance costs associated with read and write operations. In a system with n1 = Θ(n2), f1 = Θ(n1), f2 = Θ(n2), the write and read costs are respectively given by Θ(n1) and Θ(1) + n1 I(δ > 0). Here δ is a parameter closely related to the number of write operations that are concurrent with the read operation, and I(δ > 0) is 1 if δ > 0, and 0 if δ = 0. The cost of persistent storage in the back-end layer is Θ(1). The impact of temporary storage is minimally felt in a multi-object system running N independent instances of LDS, where only a small fraction of the objects undergo concurrent accesses at any point during the execution. For the multi-object system, we identify a condition on the rate of concurrent writes in the system such that the overall storage cost is dominated by that of persistent storage in the back-end layer, and is given by Θ(N).
擦除编码一致分布式存储的分层体系结构
在边缘计算范例新兴应用的推动下,我们引入了一种两层擦除编码容错分布式存储系统,为读写操作提供原子访问。在边缘计算中,客户端与地理位置较近的服务器边缘层交互;边缘层依次与服务器的后端层进行交互。边缘层为客户端操作提供低延迟访问和临时存储,后端层用于持久存储。我们的算法被称为分层数据存储(LDS)算法,它提供了一些适合边缘计算系统的特性,可以在异步消息传递环境下工作,支持多个读取器和写入器,并且可以在分别具有n1和n2服务器的两层中容忍f1 < n1/2和f2 < n2/3的崩溃故障。我们在后端层使用一类称为再生码的擦除码来存储数据。选择重新生成代码,而不是像Reed-Solomon代码这样的流行选择,不仅优化了后端存储的成本,而且还有助于优化读取操作的通信成本,因为需要从后端一直重新创建值。两层架构允许原子性和擦除代码协议的模块化实现;擦除码的实现主要局限于两层之间的交互。我们证明了LDS的活动性和原子性,并计算了与读写操作相关的性能成本。在n1 = Θ(n2), f1 = Θ(n1), f2 = Θ(n2)的系统中,写和读的成本分别由Θ(n1)和Θ(1) + n1 I(δ > 0)给出。这里δ是一个与读操作并发的写操作数密切相关的参数,当δ > 0时I(δ > 0)为1,当δ = 0时I(δ > 0)为0。后端层持久存储的成本为Θ(1)。在运行N个独立LDS实例的多对象系统中,临时存储的影响最小,其中在执行期间的任何时刻只有一小部分对象进行并发访问。对于多对象系统,我们确定了系统中并发写速率的一个条件,使得总体存储成本由后端层的持久存储成本主导,该条件由Θ(N)给出。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信