DCA-IO: A Dynamic I/O Control Scheme for Parallel and Distributed File Systems

Sunggon Kim, A. Sim, Kesheng Wu, S. Byna, Teng Wang, Yongseok Son, Hyeonsang Eom
Published in: 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 2019-05-14
DOI: 10.1109/CCGRID.2019.00049 (https://doi.org/10.1109/CCGRID.2019.00049)
Citations: 11

Abstract

In high-performance computing (HPC), storage is a shared resource used by all users, whose application requirements and knowledge of storage differ widely. Consequently, the optimal storage configuration varies with the I/O behavior of each application. System logs help in understanding storage behavior, but it is non-trivial for each user to analyze the logs and adjust complex configurations. Even experienced users find it difficult to understand the full I/O stack and to find the optimal configuration for a specific application. In this work, we analyzed the I/O activities of CORI, an HPC system at the National Energy Research Scientific Computing Center (NERSC). Our analysis shows that most users do not adjust storage configurations and instead use the default settings. It also shows that only a few applications are executed repeatedly in the HPC environment. Based on these results, we developed DCA-IO, a dynamic distributed file system configuration adjustment algorithm that uses system log information and widely adopted rules to adjust storage configurations automatically, without any user intervention. DCA-IO relies on existing system logs and requires neither code modifications nor an additional library. To demonstrate the effectiveness of DCA-IO, we performed experiments using the I/O kernels of real applications, both in an isolated small-scale Lustre environment and on CORI. Our experimental results show that the scheme can improve the performance of HPC applications by up to 75% in an isolated environment and by up to 50% in a real HPC environment, without user intervention.
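The abstract does not spell out the rules DCA-IO applies, so the following is only a minimal sketch of the general idea of rule-based configuration from log summaries, not the authors' algorithm. It assumes that a per-application summary (total bytes written, whether the output is a single shared file) has already been extracted from existing logs, and that the tunable is the Lustre stripe layout. The names `IOProfile`, `StripeConfig`, `choose_stripe`, and `setstripe_command` are hypothetical, and the size thresholds and stripe counts are illustrative placeholders rather than values from the paper. The only external interface referenced is the Lustre `lfs setstripe` command with its `-c` (stripe count) and `-S` (stripe size) options; the sketch builds the command but does not run it.

```python
from dataclasses import dataclass
from typing import List

MIB = 1024 * 1024
GIB = 1024 * MIB


@dataclass(frozen=True)
class IOProfile:
    """Summary of a previous run, assumed to be derived from system logs."""
    total_bytes: int   # bytes written by the application
    shared_file: bool  # True if all processes write to one shared file


@dataclass(frozen=True)
class StripeConfig:
    stripe_count: int  # number of OSTs the file is spread over
    stripe_size: int   # bytes per stripe


def choose_stripe(profile: IOProfile, num_osts: int) -> StripeConfig:
    """Pick a stripe layout from a log summary (placeholder thresholds)."""
    # File-per-process output and small files keep a single stripe.
    if not profile.shared_file or profile.total_bytes < GIB:
        return StripeConfig(1, MIB)
    # Larger shared files are spread over more OSTs.
    if profile.total_bytes < 10 * GIB:
        count = 8
    elif profile.total_bytes < 100 * GIB:
        count = 24
    else:
        count = 72
    # Never request more stripes than the file system has OSTs.
    return StripeConfig(min(count, num_osts), MIB)


def setstripe_command(path: str, cfg: StripeConfig) -> List[str]:
    """Build (but do not execute) the matching `lfs setstripe` command."""
    size_arg = f"{cfg.stripe_size // MIB}M"
    return ["lfs", "setstripe", "-c", str(cfg.stripe_count), "-S", size_arg, path]
```

Under these assumptions, a job whose logs show a 50 GiB shared output file would receive a 24-way stripe on a file system with at least 24 OSTs, and the resulting command could be applied to the output directory before the next run, without any change to the application itself.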