How I Learned to Stop Worrying and Love In Situ Analytics: Leveraging Latent Synchronization in MPI Collective Algorithms

Scott Levy, Kurt B. Ferreira, Patrick M. Widener, P. Bridges, Oscar H. Mondragon
{"title":"How I Learned to Stop Worrying and Love In Situ Analytics: Leveraging Latent Synchronization in MPI Collective Algorithms","authors":"Scott Levy, Kurt B. Ferreira, Patrick M. Widener, P. Bridges, Oscar H. Mondragon","doi":"10.1145/2966884.2966920","DOIUrl":null,"url":null,"abstract":"Scientific workloads running on current extreme-scale systems routinely generate tremendous volumes of data for postprocessing. This data movement has become a serious issue due to its energy cost and the fact that I/O bandwidths have not kept pace with data generation rates. In situ analytics is an increasingly popular alternative in which post-simulation processing is embedded into an application, running as part of the same MPI job. This can reduce data movement costs but introduces a new potential source of interference for the application. Using a validated simulation-based approach, we investigate how best to mitigate the interference from time-shared in situ tasks for a number of key extreme-scale workloads. This paper makes a number of contributions. First, we show that the independent scheduling of in situ analytics tasks can significantly degradation application performance, with slowdowns exceeding 1000%. Second, we demonstrate that the degree of synchronization found in many modern collective algorithms is sufficient to significantly reduce the overheads of this interference to less than 10% in most cases. Finally, we show that many applications already frequently invoke collective operations that use these synchronizing MPI algorithms. Therefore, the syncronization introduced by these MPI collective algorithms can be leveraged to efficiently schedule analytics tasks with minimal changes to existing applications. This paper provides critical analysis and guidance for MPI users and developers on the importance of scheduling in situ analytics tasks. It shows the degree of synchronization needed to mitigate the performance impacts of these time-shared coupled codes and demonstrates how that synchronization can be realized in an extreme-scale environment using modern collective algorithms.","PeriodicalId":264069,"journal":{"name":"Proceedings of the 23rd European MPI Users' Group Meeting","volume":"114 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 23rd European MPI Users' Group Meeting","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2966884.2966920","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

Scientific workloads running on current extreme-scale systems routinely generate tremendous volumes of data for postprocessing. This data movement has become a serious issue due to its energy cost and the fact that I/O bandwidths have not kept pace with data generation rates. In situ analytics is an increasingly popular alternative in which post-simulation processing is embedded into an application, running as part of the same MPI job. This can reduce data movement costs but introduces a new potential source of interference for the application. Using a validated simulation-based approach, we investigate how best to mitigate the interference from time-shared in situ tasks for a number of key extreme-scale workloads. This paper makes a number of contributions. First, we show that the independent scheduling of in situ analytics tasks can significantly degradation application performance, with slowdowns exceeding 1000%. Second, we demonstrate that the degree of synchronization found in many modern collective algorithms is sufficient to significantly reduce the overheads of this interference to less than 10% in most cases. Finally, we show that many applications already frequently invoke collective operations that use these synchronizing MPI algorithms. Therefore, the syncronization introduced by these MPI collective algorithms can be leveraged to efficiently schedule analytics tasks with minimal changes to existing applications. This paper provides critical analysis and guidance for MPI users and developers on the importance of scheduling in situ analytics tasks. It shows the degree of synchronization needed to mitigate the performance impacts of these time-shared coupled codes and demonstrates how that synchronization can be realized in an extreme-scale environment using modern collective algorithms.
我是如何学会停止担忧和热爱原位分析的:利用MPI集体算法中的潜在同步
在当前的极端规模系统上运行的科学工作负载通常会产生大量的数据进行后处理。由于其能源成本和I/O带宽跟不上数据生成速率的事实,这种数据移动已经成为一个严重的问题。原位分析是一种日益流行的替代方案,它将模拟后处理嵌入到应用程序中,作为同一MPI作业的一部分运行。这可以降低数据移动成本,但也为应用程序引入了一个新的潜在干扰源。使用经过验证的基于仿真的方法,我们研究了如何最好地减轻一些关键极端规模工作负载的分时原位任务的干扰。这篇论文做出了许多贡献。首先,我们证明了原位分析任务的独立调度会显著降低应用程序的性能,降低幅度超过1000%。其次,我们证明了在许多现代集体算法中发现的同步程度足以在大多数情况下将这种干扰的开销显著降低到10%以下。最后,我们展示了许多应用程序已经频繁地调用使用这些同步MPI算法的集合操作。因此,可以利用这些MPI集合算法引入的同步,在对现有应用程序进行最小更改的情况下有效地调度分析任务。本文为MPI用户和开发人员提供了重要的分析和指导,说明调度原位分析任务的重要性。它展示了减轻这些分时耦合代码的性能影响所需的同步程度,并演示了如何使用现代集体算法在极端规模的环境中实现同步。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信