Defragmenting DHT-based Distributed File Systems

Jeffrey Pang, Phillip B. Gibbons, M. Kaminsky, S. Seshan, Haifeng Yu
Published in: 27th International Conference on Distributed Computing Systems (ICDCS '07)
Publication date: 2007-06-25
DOI: 10.1109/ICDCS.2007.97 (https://doi.org/10.1109/ICDCS.2007.97)
Citations: 14

Abstract

Existing DHT-based file systems use consistent hashing to assign file blocks to random machines. As a result, a user task that accesses an entire file or multiple files must retrieve blocks from many different machines. This paper demonstrates that significant availability and performance gains can be achieved if users can instead retrieve all the data needed for a given task from only a few DHT nodes. We explore the design and implications of such a "defragmented" DHT-based distributed file system, called D2, that also maintains important DHT properties such as storage load balance. Using real-world file system traces, we show that a simple key encoding scheme is sufficient to maintain good defragmentation for most user tasks. Using both simulation and an actual 1,000-node deployment, we show that D2 increases availability by over an order of magnitude and improves user-perceived latency by 30-100% compared to a traditional design.
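
The contrast the abstract draws can be illustrated with a small sketch. It is not D2's actual key encoding, which the abstract does not specify. It assumes a hypothetical order-preserving key (path bytes followed by a fixed-width block number) and a toy ring whose node boundaries are chosen so that every node stores the same number of blocks. The names `hashed_key`, `locality_key`, `owner`, and `nodes_touched`, along with the ring size and file layout, are all illustrative assumptions.

```python
import hashlib
from bisect import bisect_left

NUM_NODES = 100        # illustrative ring size, not the paper's deployment
BLOCKS_PER_FILE = 50   # illustrative file size in blocks
FILES = [f"/data/file{f:02d}" for f in range(20)]  # hypothetical paths


def hashed_key(path: str, block: int) -> bytes:
    """Traditional placement: hashing (path, block) scatters blocks."""
    return hashlib.sha1(f"{path}#{block}".encode()).digest()


def locality_key(path: str, block: int) -> bytes:
    """Hypothetical order-preserving key (an assumption, not D2's scheme):
    blocks of one file sort next to each other in key space."""
    return path.encode() + b"\x00" + block.to_bytes(8, "big")


def owner(boundaries: list, key: bytes) -> int:
    """Node index owning `key`: the first boundary >= key, wrapping around."""
    return bisect_left(boundaries, key) % len(boundaries)


def nodes_touched(boundaries: list, key_fn, path: str) -> int:
    """Number of distinct nodes contacted to read every block of `path`."""
    return len({owner(boundaries, key_fn(path, b))
                for b in range(BLOCKS_PER_FILE)})


# Traditional ring: node positions are hashes of (made-up) node names.
hashed_ring = sorted(hashlib.sha1(f"node{i}".encode()).digest()
                     for i in range(NUM_NODES))

# Defragmented ring: boundaries are placed so each node holds an equal
# share of the stored keys, which keeps storage load balanced.
all_keys = sorted(locality_key(p, b)
                  for p in FILES for b in range(BLOCKS_PER_FILE))
per_node = len(all_keys) // NUM_NODES
locality_ring = all_keys[per_node - 1::per_node]

hashed_count = nodes_touched(hashed_ring, hashed_key, FILES[0])
locality_count = nodes_touched(locality_ring, locality_key, FILES[0])
print(f"hashed placement touches {hashed_count} nodes")
print(f"locality-preserving placement touches {locality_count} nodes")
```

Under these assumptions each node stores 10 blocks, so reading one 50-block file contacts 5 adjacent nodes with the locality-preserving key, while the hashed placement spreads the same blocks over many more. Fewer nodes per task means fewer machines whose failure or slowness can affect that task, which is the intuition behind the availability and latency gains the abstract reports.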