EDRFS: An Effective Distributed Replication File System for Small-File and Data-Intensive Application

Bin Cai, C. Xie, Guangxi Zhu
Published in: 2007 2nd International Conference on Communication Systems Software and Middleware, 2007-07-09
DOI: 10.1109/COMSWA.2007.382422 (https://doi.org/10.1109/COMSWA.2007.382422)
Citations: 8

Abstract

As systems grow in scale, the key challenges are to mask the failures that arise among system components and to improve the performance of data-intensive applications. This paper presents the design and implementation of EDRFS, a cluster-based distributed replication file system that addresses both. EDRFS uses a single metadata server and multiple storage nodes, replicates whole files at the file level, and tracks which storage nodes hold the replicas of each file. A linear hash algorithm distributes data and load evenly across the storage nodes, which balances the workload and lets throughput and storage capacity scale incrementally as the system grows. In addition, clients cache both metadata and file data to improve performance. A concurrency lock scheme avoids a bottleneck in namespace operations, and a replica consistency method keeps the mutation order consistent across the replicas of a file. We report initial experimental evaluations of our prototype on a small-file, data-intensive workload.
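
The abstract does not spell out how the linear hash algorithm maps files to storage nodes. The sketch below shows one textbook (Litwin-style) linear hashing scheme, to illustrate why adding a node moves only a small share of the files. It is not the authors' implementation: the class `LinearHashPlacement`, its methods, the SHA-1 key derivation, and the rule of placing replicas on successive nodes are all assumptions made for illustration.

```python
import hashlib


class LinearHashPlacement:
    """Hypothetical file-to-node mapping based on textbook linear hashing."""

    def __init__(self, initial_nodes: int = 2):
        if initial_nodes < 1:
            raise ValueError("need at least one storage node")
        self.n0 = initial_nodes  # node count at level 0
        self.level = 0           # number of completed doubling rounds
        self.split = 0           # next node whose files will be split

    @property
    def node_count(self) -> int:
        return self.n0 * (2 ** self.level) + self.split

    @staticmethod
    def _key(name: str) -> int:
        # Assumption: derive a stable integer key from the file name.
        digest = hashlib.sha1(name.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def locate(self, name: str) -> int:
        """Return the index of the primary storage node for a file."""
        key = self._key(name)
        base = self.n0 * (2 ** self.level)
        index = key % base
        if index < self.split:
            # This node was already split in the current round,
            # so use the next-level hash function.
            index = key % (2 * base)
        return index

    def add_node(self) -> tuple[int, int]:
        """Grow by one node; return (split_node, new_node)."""
        base = self.n0 * (2 ** self.level)
        split_node = self.split
        new_node = base + split_node
        self.split += 1
        if self.split == base:
            # Every node of this round has been split: start the next round.
            self.level += 1
            self.split = 0
        return split_node, new_node

    def replicas(self, name: str, count: int) -> list[int]:
        """Assumed replica rule: primary node plus its successors."""
        primary = self.locate(name)
        total = self.node_count
        return [(primary + i) % total for i in range(min(count, total))]
```

With this scheme, each call to `add_node()` relocates only files that lived on the node being split, and only to the newly added node. That gives the incremental growth the abstract describes, at the cost of temporarily uneven load between split and unsplit nodes within a round.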