多核集群系统上显式ODE方法的性能预测

M. Scherg, Johannes Seiferth, Matthias Korch, T. Rauber
{"title":"多核集群系统上显式ODE方法的性能预测","authors":"M. Scherg, Johannes Seiferth, Matthias Korch, T. Rauber","doi":"10.1145/3297663.3310306","DOIUrl":null,"url":null,"abstract":"When migrating a scientific application to a new HPC system, the program code usually has to be re-tuned to achieve the best possible performance. Auto-tuning techniques are a promising approach to support the portability of performance. Often, a large pool of possible implementation variants exists from which the most efficient variant needs to be selected. Ideally, auto-tuning approaches should be capable of undertaking this task in an efficient manner for a new HPC system and new characteristics of the input data by applying suitable analytic models and program transformations. In this article, we discuss a performance prediction methodology for multi-core cluster applications, which can assist this selection process by significantly reducing the selection effort compared to in-depth runtime tests. The methodology proposed is an extension of an analytical performance prediction model for shared-memory applications introduced in our previous work. Our methodology is based on the execution-cache-memory (ECM) performance model and estimations of intra-node and inter-node communication costs, which we apply to numerical solution methods for ordinary differential equations (ODEs). In particular, we investigate whether it is possible to obtain accurate performance predictions for hybrid MPI/OpenMP implementation variants in order to support the variant selection. We demonstrate that our approach is able to reliably select a set of efficient variants for a given configuration (ODE system, solver and hardware platform) and, thus, to narrow down the search space for possible later empirical tuning.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Performance Prediction of Explicit ODE Methods on Multi-Core Cluster Systems\",\"authors\":\"M. Scherg, Johannes Seiferth, Matthias Korch, T. Rauber\",\"doi\":\"10.1145/3297663.3310306\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"When migrating a scientific application to a new HPC system, the program code usually has to be re-tuned to achieve the best possible performance. Auto-tuning techniques are a promising approach to support the portability of performance. Often, a large pool of possible implementation variants exists from which the most efficient variant needs to be selected. Ideally, auto-tuning approaches should be capable of undertaking this task in an efficient manner for a new HPC system and new characteristics of the input data by applying suitable analytic models and program transformations. In this article, we discuss a performance prediction methodology for multi-core cluster applications, which can assist this selection process by significantly reducing the selection effort compared to in-depth runtime tests. The methodology proposed is an extension of an analytical performance prediction model for shared-memory applications introduced in our previous work. Our methodology is based on the execution-cache-memory (ECM) performance model and estimations of intra-node and inter-node communication costs, which we apply to numerical solution methods for ordinary differential equations (ODEs). In particular, we investigate whether it is possible to obtain accurate performance predictions for hybrid MPI/OpenMP implementation variants in order to support the variant selection. We demonstrate that our approach is able to reliably select a set of efficient variants for a given configuration (ODE system, solver and hardware platform) and, thus, to narrow down the search space for possible later empirical tuning.\",\"PeriodicalId\":273447,\"journal\":{\"name\":\"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering\",\"volume\":\"35 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-04-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3297663.3310306\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3297663.3310306","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

摘要

当将科学应用程序迁移到新的HPC系统时,通常必须重新调整程序代码以实现最佳性能。自动调优技术是支持性能可移植性的一种很有前途的方法。通常,存在大量可能的实现变体,需要从中选择最有效的变体。理想情况下,通过应用合适的分析模型和程序转换,自动调整方法应该能够以一种有效的方式为新的高性能计算系统和输入数据的新特征承担这项任务。在本文中,我们将讨论一种多核集群应用程序的性能预测方法,与深入的运行时测试相比,该方法可以显著减少选择工作量,从而帮助选择过程。所提出的方法是对我们之前工作中介绍的共享内存应用程序的分析性能预测模型的扩展。我们的方法是基于执行-缓存-内存(ECM)性能模型和节点内和节点间通信成本的估计,我们将其应用于常微分方程(ode)的数值解方法。特别是,我们研究了是否有可能获得混合MPI/OpenMP实现变体的准确性能预测,以支持变体选择。我们证明,我们的方法能够可靠地为给定的配置(ODE系统、求解器和硬件平台)选择一组有效的变体,从而缩小搜索空间,以便以后可能的经验调整。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Performance Prediction of Explicit ODE Methods on Multi-Core Cluster Systems
When migrating a scientific application to a new HPC system, the program code usually has to be re-tuned to achieve the best possible performance. Auto-tuning techniques are a promising approach to support the portability of performance. Often, a large pool of possible implementation variants exists from which the most efficient variant needs to be selected. Ideally, auto-tuning approaches should be capable of undertaking this task in an efficient manner for a new HPC system and new characteristics of the input data by applying suitable analytic models and program transformations. In this article, we discuss a performance prediction methodology for multi-core cluster applications, which can assist this selection process by significantly reducing the selection effort compared to in-depth runtime tests. The methodology proposed is an extension of an analytical performance prediction model for shared-memory applications introduced in our previous work. Our methodology is based on the execution-cache-memory (ECM) performance model and estimations of intra-node and inter-node communication costs, which we apply to numerical solution methods for ordinary differential equations (ODEs). In particular, we investigate whether it is possible to obtain accurate performance predictions for hybrid MPI/OpenMP implementation variants in order to support the variant selection. We demonstrate that our approach is able to reliably select a set of efficient variants for a given configuration (ODE system, solver and hardware platform) and, thus, to narrow down the search space for possible later empirical tuning.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信