Modeling and Optimization of Latency in Erasure-coded Storage Systems

Impact factor: 2.0; JCR quartile: Q3; Category: Computer Science, Interdisciplinary Applications
V. Aggarwal, Tian Lan
Journal: Foundations and Trends in Communications and Information Theory
Volume: 11 1
Pages: 380-525
Publication date: 2020-05-21
Publication type: Journal Article
DOI: 10.1561/0100000108 (https://doi.org/10.1561/0100000108)
Indexed via: Semantic Scholar
Citations: 2

Abstract

Modeling and Optimization of Latency in Erasure-coded Storage Systems
As consumers increasingly engage in social networking and e-commerce, businesses come to rely on Big Data analytics for intelligence, and traditional IT infrastructures continue to migrate to the cloud and edge, demand for distributed data storage is rising at an unprecedented speed. Erasure coding has quickly emerged as a promising technique to reduce storage cost while providing reliability similar to that of replicated systems, and it has been widely adopted by companies like Facebook, Microsoft and Google. However, it also brings new challenges in characterizing and optimizing access latency when erasure codes are used in distributed storage. The aim of this monograph is to review recent progress (both theoretical and practical) on systems that employ erasure codes for distributed storage.

In this monograph, we first identify the key challenges and a taxonomy of the research problems, and then give an overview of the different approaches that have been developed to quantify and model the latency of erasure-coded storage. This includes recent work leveraging MDS-Reservation, Fork-Join, Probabilistic, and Delayed-Relaunch scheduling policies, as well as their application to characterizing the access latency (e.g., mean, tail, and asymptotic latency) of erasure-coded distributed storage systems. We also extend the problem to the case where users stream videos from erasure-coded distributed storage systems. Next, we bridge the gap between theory and practice and discuss lessons learned from prototype implementations. In particular, we discuss exemplary implementations of erasure-coded storage, illuminate key design degrees of freedom and tradeoffs, and summarize remaining challenges in real-world storage systems, such as in content delivery and caching. Open problems for future research are discussed at the end of each chapter.
Source journal
Foundations and Trends in Communications and Information Theory (Computer Science, Interdisciplinary Applications)
CiteScore: 7.90
Self-citation rate: 0.00%
Articles published: 6
Journal description: Foundations and Trends® in Communications and Information Theory publishes survey and tutorial articles on the following topics:
- Coded modulation
- Coding theory and practice
- Communication complexity
- Communication system design
- Cryptology and data security
- Data compression
- Data networks
- Demodulation and Equalization
- Denoising
- Detection and estimation
- Information theory and statistics
- Information theory and computer science
- Joint source/channel coding
- Modulation and signal design
- Multiuser detection
- Multiuser information theory