How I Learned to Stop Worrying and Love In Situ Analytics: Leveraging Latent Synchronization in MPI Collective Algorithms

Proceedings of the 23rd European MPI Users' Group Meeting. Pub Date: 2016-09-25. DOI: 10.1145/2966884.2966920

Scott Levy, Kurt B. Ferreira, Patrick M. Widener, P. Bridges, Oscar H. Mondragon


Citations: 5

Abstract

Scientific workloads running on current extreme-scale systems routinely generate tremendous volumes of data for postprocessing. This data movement has become a serious issue due to its energy cost and the fact that I/O bandwidths have not kept pace with data generation rates. In situ analytics is an increasingly popular alternative in which post-simulation processing is embedded into an application, running as part of the same MPI job. This can reduce data movement costs but introduces a new potential source of interference for the application. Using a validated simulation-based approach, we investigate how best to mitigate the interference from time-shared in situ tasks for a number of key extreme-scale workloads. This paper makes a number of contributions. First, we show that the independent scheduling of in situ analytics tasks can significantly degrade application performance, with slowdowns exceeding 1000%. Second, we demonstrate that the degree of synchronization found in many modern collective algorithms is sufficient to significantly reduce the overheads of this interference to less than 10% in most cases. Finally, we show that many applications already frequently invoke collective operations that use these synchronizing MPI algorithms. Therefore, the synchronization introduced by these MPI collective algorithms can be leveraged to efficiently schedule analytics tasks with minimal changes to existing applications. This paper provides critical analysis and guidance for MPI users and developers on the importance of scheduling in situ analytics tasks. It shows the degree of synchronization needed to mitigate the performance impacts of these time-shared coupled codes and demonstrates how that synchronization can be realized in an extreme-scale environment using modern collective algorithms.
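
The scheduling effect described in the abstract can be illustrated with a toy model. The sketch below is not the authors' validated simulator and does not reproduce the paper's measurements. It assumes a bulk-synchronous loop in which every iteration ends in a synchronizing collective, so the slowest rank sets the iteration time, and in which each rank runs one analytics task every `period` iterations. All names and parameter values (`run_time`, `compare`, `work`, `analytics`, `period`) are hypothetical and chosen only for illustration.

```python
def run_time(num_ranks, num_iters, work, analytics, period, offsets):
    """Total time of a bulk-synchronous loop (toy model).

    Each iteration ends in a synchronizing collective, so the iteration
    takes as long as its slowest rank. Rank r runs its analytics task in
    the iterations where (iteration % period) == offsets[r].
    """
    total = 0.0
    for it in range(num_iters):
        per_rank = [
            work + (analytics if it % period == offsets[r] else 0.0)
            for r in range(num_ranks)
        ]
        total += max(per_rank)  # the collective waits for the slowest rank
    return total


def compare(num_ranks=64, num_iters=1000, work=1.0, analytics=1.0, period=10):
    """Return (independent_slowdown, aligned_slowdown) as fractions of baseline."""
    baseline = num_iters * work  # run time with no analytics at all

    # Assumption: "independent" scheduling is modeled as staggered offsets,
    # so in every iteration at least one rank is busy with analytics.
    independent = [r % period for r in range(num_ranks)]

    # Assumption: "aligned" scheduling places every rank's analytics task
    # in the same iteration, as if triggered right after a shared collective.
    aligned = [0] * num_ranks

    t_ind = run_time(num_ranks, num_iters, work, analytics, period, independent)
    t_ali = run_time(num_ranks, num_iters, work, analytics, period, aligned)
    return t_ind / baseline - 1.0, t_ali / baseline - 1.0


if __name__ == "__main__":
    ind, ali = compare()
    print(f"independent scheduling slowdown: {ind:.1%}")
    print(f"aligned scheduling slowdown:     {ali:.1%}")
```

In this model the staggered case pays the analytics cost in every iteration, because some rank is always delayed, while the aligned case pays it only once per period. The figures it prints are properties of the toy parameters, not of the workloads studied in the paper.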
