Towards Dynamic Resource Management with MPI Sessions and PMIx

Dominik Huber, Maximilian Streubel, Isaías Comprés, M. Schulz, Martin Schreiber, H. Pritchard
Proceedings of the 29th European MPI Users' Group Meeting, published 2022-09-14
DOI: 10.1145/3555819.3555856 (https://doi.org/10.1145/3555819.3555856)
Citations: 7

Abstract

Job management software on peta- and exascale supercomputers continues to provide static resource allocations, from a program’s start until its end. Dynamic resource allocation and management is a research direction that has the potential to improve the efficiency of HPC systems and applications by adapting the resources of an application during its runtime. Resources can be adapted based on past, current or even future system conditions and matching optimization targets. However, implementing dynamic resource management is challenging, as it requires support across many layers of the software stack, including the programming model. In this paper, we focus on the latter and present our approach to extending MPI Sessions to support dynamic resource allocations within MPI applications. While some forms of dynamicity already exist in MPI, they are currently limited by requiring global synchronization, by being application or application-domain specific, or by suffering from limited support in current HPC system software stacks. We overcome these limitations with a simple yet powerful abstraction: resources as process sets, and changes of resources as set operations, leading to a graph-based perspective on resource changes. As the main contribution of this work, we provide an implementation of this approach based on MPI Sessions and PMIx. In addition, we illustrate its usage and discuss the required extensions of the PMIx standard. We report results based on a prototype implementation with Open MPI using a synthetic application, as well as a PDE solver benchmark on up to four nodes and a total of 112 cores. Overall, our results show the feasibility of our approach, which has only very moderate overheads. We see this first proof-of-concept as an important step towards resource adaptivity based on MPI Sessions.
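The central abstraction in the abstract (resources as process sets, resource changes as set operations, and the resulting graph of changes) can be illustrated with a small toy model. The sketch below is not the paper's implementation and uses neither MPI nor PMIx: the class `PsetGraph`, its methods, and the set names such as `app://initial` are hypothetical and chosen only for illustration. It assumes that growing an application corresponds to a union of process sets and shrinking to a set difference, and it records each operation as an edge from its input sets to its output set.

```python
from dataclasses import dataclass, field


@dataclass
class PsetGraph:
    """Toy model: named process sets plus a log of the set operations that derived them."""
    psets: dict = field(default_factory=dict)  # name -> frozenset of process ids
    edges: list = field(default_factory=list)  # (operation, input names, output name)

    def define(self, name, procs):
        # Register a process set that is given rather than derived.
        self.psets[name] = frozenset(procs)

    def _derive(self, op, inputs, output, procs):
        # Store the derived set and record the operation as a graph edge.
        self.psets[output] = frozenset(procs)
        self.edges.append((op, tuple(inputs), output))
        return self.psets[output]

    def union(self, a, b, output):
        # Assumed model of growing: the new set contains both inputs.
        return self._derive("union", [a, b], output, self.psets[a] | self.psets[b])

    def difference(self, a, b, output):
        # Assumed model of shrinking: processes in b are removed from a.
        return self._derive("difference", [a, b], output, self.psets[a] - self.psets[b])


# Hypothetical scenario: start with 4 processes, add 4, then release 2.
graph = PsetGraph()
graph.define("app://initial", range(4))
graph.define("app://added", range(4, 8))
grown = graph.union("app://initial", "app://added", "app://grown")
graph.define("app://released", [6, 7])
shrunk = graph.difference("app://grown", "app://released", "app://shrunk")
```

In this toy model the `edges` list is the graph-based view: each entry links the input sets of an operation to the set it produced, so the history of resource changes can be followed from the initial set to the current one.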