Erasure Coding in Object Stores: Challenges and Opportunities
Lewis Tseng
Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing, published 2018-07-23
DOI: https://doi.org/10.1145/3212734.3212799
Citation count: 1
Abstract
Recent years have seen tremendous growth in the popularity of online services accessed over the Internet. Our daily lives are becoming more and more dependent on these services, which generate and/or rely on huge amounts of data. One core technique for handling this unprecedented volume of data is the distributed storage system, which allows users and applications to read and write data in a distributed fashion while ensuring fault tolerance, durability, scalability, and availability. This tutorial focuses on distributed key-value storage systems, i.e., read/write objects. One common way to implement such a read/write object is to replicate data across multiple servers or even data centers. Replication-based implementations have been studied in the literature, e.g., ABD [Attiya, Bar-Noy and Dolev '96] and LDR [Fan and Lynch '03], and adopted in practice, e.g., in Cassandra, MongoDB, and DynamoDB. One drawback of the replication-based mechanism is its high storage and communication cost, caused by unnecessary redundancy. To address this issue, there is an ongoing effort in both academia and industry to apply erasure codes to distributed storage systems. For example, Microsoft applies erasure coding across data centers to build strongly consistent objects (Giza in Microsoft Azure Storage), and OpenStack provides erasure coding as a storage policy in Swift, its read/write object store. However, the field is still fairly young and has many interesting open problems. This tutorial focuses on the challenges of using erasure codes in read/write objects that guarantee consistency. To begin with, I will introduce the concepts of consistency models and erasure codes, followed by some recent algorithms and existing practical systems. I will then discuss the state-of-the-art techniques in this field and conclude the talk with potential challenges that lead to interesting research problems. The talk will be accessible to anyone with basic knowledge of algorithms or programming. The first part of the results is due to Viveck Cadambe, Kishori Konwar, N. Prakash, Nancy Lynch, and Muriel Médard. At the end, I will also share our recent results.
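
To make the storage-cost argument concrete, the following is a minimal sketch of a toy erasure code, not an algorithm from the tutorial. It assumes a hypothetical (3, 2) XOR parity code: a value is split into two data fragments plus one parity fragment, each stored on a different server, so any single lost fragment can be rebuilt. The function names `encode` and `decode` are illustrative only, and the sketch assumes values of even length.

```python
from typing import List, Optional


def xor_bytes(left: bytes, right: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(left, right))


def encode(value: bytes) -> List[bytes]:
    """Split value into 2 data fragments and 1 XOR parity fragment.

    Assumption for this sketch: the value has even length, so both
    data fragments have the same size.
    """
    if len(value) % 2 != 0:
        raise ValueError("this toy code only handles even-length values")
    half = len(value) // 2
    first, second = value[:half], value[half:]
    return [first, second, xor_bytes(first, second)]


def decode(fragments: List[Optional[bytes]]) -> bytes:
    """Rebuild the value from any 2 of the 3 fragments (None = lost)."""
    if len(fragments) != 3 or sum(f is None for f in fragments) > 1:
        raise ValueError("need at least 2 of the 3 fragments")
    first, second, parity = fragments
    if first is None:
        first = xor_bytes(second, parity)
    elif second is None:
        second = xor_bytes(first, parity)
    return first + second


if __name__ == "__main__":
    value = b"obj-v1!!"
    fragments = encode(value)
    # Simulate losing the server that holds the first fragment.
    recovered = decode([None, fragments[1], fragments[2]])
    stored = sum(len(f) for f in fragments)
    print(recovered == value, stored / len(value))
```

Under these assumptions the three fragments together occupy 1.5 times the size of the value, whereas keeping two full replicas, which also survives the loss of one server, occupies 2 times the size. The sketch covers only encoding and recovery; it does not address the consistency of concurrent reads and writes, which is the challenge the tutorial focuses on.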