LDPP: A Learned Directory Placement Policy in Distributed File Systems

Yuanzhang Wang, Fengkui Yang, Ji Zhang, Chun-hua Li, Ke Zhou, Chong Liu, Zhuo Cheng, Wei Fang, Jinhu Liu
Proceedings of the 51st International Conference on Parallel Processing, published 2022-08-29
DOI: 10.1145/3545008.3545057 (https://doi.org/10.1145/3545008.3545057)

Abstract

Load balancing is a critical problem in distributed file systems (DFS). Previous work focuses on distributing data evenly across nodes or storage devices at the file level, but fails to exploit directory locality and the long duration of directory hotness, which can reduce the degree of balance and degrade performance. To overcome this shortcoming, we propose a learning-based directory placement policy, called LDPP, which determines the data layout by predicting the load. We first establish a relationship between a directory's request characteristics and its state information in order to predict that state information (storage capacity, bandwidth, and IOPS). Then, new directories are placed across nodes in a multi-dimensional manner, using the Manhattan distance over the predicted multidimensional state information. We also take into account the trade-off between directories of the same category, as classified by the load prediction module, and peer directories, and explore their influence on balance. Extensive experiments demonstrate that LDPP not only alleviates load imbalance and increases resource utilization but also improves DFS performance in practice, reducing service latency by up to 36% and increasing IOPS and bandwidth by 8% and 9%, respectively.
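
The placement step can be illustrated with a minimal Python sketch. The abstract does not give LDPP's actual objective, so this rests on stated assumptions: each node's load is a normalized (capacity, bandwidth, IOPS) vector, the predicted load of the new directory is already available, and the chosen node is the one that minimizes the summed Manhattan distance between every node's load and the cluster mean after placement. The names `Node`, `manhattan`, and `place` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass

# Hypothetical data model: (capacity, bandwidth, IOPS), each normalized to [0, 1].
Load = tuple[float, float, float]


@dataclass
class Node:
    name: str
    load: Load  # current normalized utilization of this node


def manhattan(a: Load, b: Load) -> float:
    """L1 (Manhattan) distance between two load vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def place(nodes: list[Node], predicted: Load) -> Node:
    """Assign a new directory with the given predicted load to one node.

    Assumed objective (not taken from the paper): minimize the sum of
    Manhattan distances between each node's load and the cluster mean
    load after the directory has been placed.
    """
    n = len(nodes)
    # The post-placement mean is the same whichever node is chosen.
    target = tuple(
        (sum(node.load[i] for node in nodes) + predicted[i]) / n
        for i in range(3)
    )

    def imbalance(candidate: Node) -> float:
        total = 0.0
        for node in nodes:
            load = node.load
            if node is candidate:
                load = tuple(l + p for l, p in zip(load, predicted))
            total += manhattan(load, target)
        return total

    best = min(nodes, key=imbalance)
    # Record the new directory's predicted load on the chosen node.
    best.load = tuple(l + p for l, p in zip(best.load, predicted))
    return best
```

Because the distance is taken over all three dimensions at once, a capacity-heavy directory tends to land on a node with spare capacity even if that node is busier in bandwidth or IOPS, which is one plausible reading of placement "in a multi-dimensional manner".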