Supporting task-level fault-tolerance in HPC workflows by launching MPI jobs inside MPI jobs

Matthieu Dorier, J. Wozniak, R. Ross
{"title":"Supporting task-level fault-tolerance in HPC workflows by launching MPI jobs inside MPI jobs","authors":"Matthieu Dorier, J. Wozniak, R. Ross","doi":"10.1145/3150994.3151001","DOIUrl":null,"url":null,"abstract":"While the use of workflows for HPC is growing, MPI interoperability remains a challenge for workflow management systems. The MPI standard and/or its implementations provide a number of ways to build multiple-programs-multiple-data (MPMD) applications. These methods present limitations related to fault tolerance, and are not easy to use. In this paper, we advocate for a novel MPI_Comm_launch function acting as the parallel counterpart of a system(3) call. MPI_Comm_launch allows a child MPI application to be launched inside the resources originally held by processes of a parent MPI application. Two important aspects of MPI_Comm_launch is that it pauses the calling process, and runs the child processes on the parent's CPU cores, but in an isolated manner with respect to memory. This function makes it easier to build MPMD applications with well-decoupled subtasks. We show how this feature can provide better flexibility and better fault tolerance in ensemble simulations and HPC workflows. We report results showing 2x throughput improvement for application workflows with faults, and scaling results for challenging workloads up to 256 nodes.","PeriodicalId":228111,"journal":{"name":"Proceedings of the 12th Workshop on Workflows in Support of Large-Scale Science","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 12th Workshop on Workflows in Support of Large-Scale Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3150994.3151001","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

Abstract

While the use of workflows for HPC is growing, MPI interoperability remains a challenge for workflow management systems. The MPI standard and its implementations provide a number of ways to build multiple-program, multiple-data (MPMD) applications, but these methods have limitations related to fault tolerance and are not easy to use. In this paper, we advocate for a novel MPI_Comm_launch function acting as the parallel counterpart of a system(3) call. MPI_Comm_launch allows a child MPI application to be launched inside the resources originally held by the processes of a parent MPI application. Two important aspects of MPI_Comm_launch are that it pauses the calling processes and runs the child processes on the parents' CPU cores, while keeping them isolated with respect to memory. This function makes it easier to build MPMD applications with well-decoupled subtasks. We show how this feature can provide better flexibility and better fault tolerance in ensemble simulations and HPC workflows. We report results showing a 2x throughput improvement for application workflows with faults, and scaling results for challenging workloads on up to 256 nodes.
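
The abstract describes MPI_Comm_launch only at a high level. The C sketch below illustrates how a parent MPI application might invoke it: the function name and its semantics (collective call, calling processes paused, child reusing the parents' cores with isolated memory) come from the abstract, but the exact signature and parameters shown here are assumptions made for illustration, not the paper's definition.

    #include <mpi.h>
    #include <stdio.h>

    /* Hypothetical prototype, assumed for this sketch only: the paper
     * proposes MPI_Comm_launch, but the abstract does not give its
     * actual signature. */
    int MPI_Comm_launch(const char *command, MPI_Info info,
                        MPI_Comm comm, int *exit_code);

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int exit_code = 0;

        /* All ranks of MPI_COMM_WORLD collectively launch a child MPI
         * application ("./child_app" is a placeholder). Like a parallel
         * system(3), the calling processes are paused until the child
         * completes; the child runs on the parents' CPU cores but with
         * its own, isolated memory. */
        MPI_Comm_launch("./child_app", MPI_INFO_NULL,
                        MPI_COMM_WORLD, &exit_code);

        printf("child application exited with code %d\n", exit_code);

        MPI_Finalize();
        return 0;
    }

In this reading, MPI_Comm_launch would replace ad hoc MPMD constructions (separate mpiexec invocations or MPI_Comm_spawn) with a single blocking, collective launch, which is what enables the task-level fault isolation the abstract claims.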