A Server-Level Adaptive Data Layout Strategy for Parallel File Systems

Huaiming Song, Hui Jin, Jun He, Xian-He Sun, R. Thakur
Published in: 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum
Publication date: 2012-05-21
DOI: 10.1109/IPDPSW.2012.246
URL: https://doi.org/10.1109/IPDPSW.2012.246
Citations: 19

Abstract

Parallel file systems are widely used to provide a high degree of I/O parallelism and thereby mask the gap between I/O and memory speed. However, peak I/O performance is rarely attained because applications have complex data access patterns. We observe that the I/O performance of small requests is often limited by the request service rate, whereas the performance of large requests is limited by I/O bandwidth. Taking both factors into account, we propose a server-level adaptive data layout strategy. The strategy adopts different stripe sizes for different file servers according to the data access characteristics on each individual server. File servers that can fully utilize their bandwidth hold more data, and file servers that are limited by request service rate hold less data. As a result, heavily loaded servers can offload some data accesses to lightly loaded servers, which can improve I/O performance. We present a method to measure the access cost of each data block and then use an equal-depth histogram approach to distribute data blocks across multiple servers adaptively, so as to balance data accesses on all file servers. Analytical and experimental results demonstrate that the proposed server-level adaptive layout strategy can improve I/O performance by as much as 80.3% and is more appropriate for applications with complex data access patterns.
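
The abstract does not give the cost model or the exact mapping from histogram buckets to stripe sizes, so the following Python sketch only illustrates the general idea under stated assumptions. It assumes a request costs a fixed startup time plus its size divided by bandwidth (so small requests are dominated by startup and large ones by bandwidth), spreads that cost evenly over the blocks the request touches, and then cuts the block sequence into contiguous ranges of roughly equal total cost, one per server. The names `block_costs`, `equal_depth_ranges`, `startup`, and `bandwidth`, and their default values, are hypothetical and do not come from the paper.

```python
from bisect import bisect_left
from itertools import accumulate


def block_costs(requests, num_blocks, block_size, startup=1e-3, bandwidth=100e6):
    """Estimate a per-block access cost from a trace of (offset, length) requests.

    Assumed model (not from the paper): cost = startup + length / bandwidth,
    shared evenly among the blocks the request touches.
    """
    costs = [0.0] * num_blocks
    for offset, length in requests:
        if length <= 0:
            continue
        first = offset // block_size
        last = min((offset + length - 1) // block_size, num_blocks - 1)
        if first > last:
            continue  # request lies beyond the modeled region
        share = (startup + length / bandwidth) / (last - first + 1)
        for b in range(first, last + 1):
            costs[b] += share
    return costs


def equal_depth_ranges(costs, num_servers):
    """Split blocks into num_servers contiguous ranges of roughly equal cost.

    Returns a list of (start, end) block indices, end exclusive.
    """
    prefix = [0.0] + list(accumulate(costs))  # prefix[c] = cost of first c blocks
    total = prefix[-1]
    cuts = [0]
    for k in range(1, num_servers):
        target = total * k / num_servers
        i = bisect_left(prefix, target)  # smallest c with prefix[c] >= target
        # Take the neighbouring cut whose cumulative cost is closer to the target.
        if i > 0 and target - prefix[i - 1] < prefix[i] - target:
            i -= 1
        cuts.append(max(i, cuts[-1]))  # keep cut points non-decreasing
    cuts.append(len(costs))
    return list(zip(cuts[:-1], cuts[1:]))


if __name__ == "__main__":
    # Two "hot" blocks in the middle of an otherwise lightly accessed region.
    demo_costs = [1, 1, 1, 1, 8, 8, 1, 1, 1, 1]
    for server, (start, end) in enumerate(equal_depth_ranges(demo_costs, 4)):
        print(server, (start, end), sum(demo_costs[start:end]))
```

In this toy run the two hot blocks each get a server to themselves while the lightly accessed blocks are grouped four to a server, which mirrors the abstract's point that servers limited by request service rate should hold less data. A single heavy block can still leave the ranges unequal, since blocks are not split.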