Emulating I/O Behavior in Scientific Workflows on High Performance Computing Systems

Fahim Chowdhury, Yue Zhu, F. Natale, A. Moody, Elsa Gonsiorowski, K. Mohror, Weikuan Yu
Published in: 2020 IEEE/ACM Fifth International Parallel Data Systems Workshop (PDSW)
Publication date: 2020-11-01
DOI: 10.1109/PDSW51947.2020.00011 (https://doi.org/10.1109/PDSW51947.2020.00011)
Citations: 3

Abstract

Scientific application workflows leverage the capabilities of cutting-edge high-performance computing (HPC) facilities to enable complex applications for the academic, research, and industry communities. Data transfer and I/O dependencies among the modules of modern HPC workflows can increase their complexity and hamper their overall performance. Understanding the complexity that arises from data dependency and dataflow is an essential prerequisite for developing optimization strategies that improve I/O performance and, ultimately, the performance of the entire workflow. In this paper, we discuss dataflow patterns for workflow applications on HPC systems. Because existing I/O benchmarking tools fall short in identifying and representing the dataflow in modern HPC workflows, we have implemented Wemul, an open-source workflow I/O emulation framework, to mimic the different types of I/O behavior exhibited by common and complex HPC application workflows for deeper analysis. We elaborate on the features and usage of Wemul, demonstrate its application to HPC workflows, and discuss the insights from performance analysis results on the Lassen supercomputing cluster at Lawrence Livermore National Laboratory (LLNL).