Towards Heterogeneous Multi-scale Computing on Large Scale Parallel Supercomputers

Supercomput. Front. Innov. Pub Date: 2019-12-01. DOI: 10.14529/jsfi190402

S. Alowayyed, M. Vassaux, B. Czaja, P. Coveney, A. Hoekstra


Citations: 3

Abstract

New applications that can exploit emerging exascale computing resources efficiently, while providing meaningful scientific results, are eagerly anticipated. Multi-scale models, especially multi-scale applications, will assuredly run at the exascale. We have established that a class of multi-scale applications implementing the heterogeneous multi-scale model follows a heterogeneous multi-scale computing (HMC) pattern, which typically features a macroscopic model synchronising numerous independent microscopic model simulations. Consequently, communication between microscopic simulations is limited. Furthermore, a surrogate model can often be introduced between the macro-scale and micro-scale models to interpolate required data from previously computed micro-scale simulations, thereby substantially reducing the number of micro-scale simulations. Nonetheless, HMC applications, though versatile, remain constrained by load balancing. We discuss two main issues: the a priori unknown and variable execution time of microscopic simulations, and the dynamic number of micro-scale simulations required. We tackle execution time variability using a pilot job mechanism that handles internal queuing and multiple sub-model execution on large-scale supercomputers, together with a data-informed execution time prediction model. To dynamically select the number of micro-scale simulations, the HMC pattern automatically detects and identifies three surrogate model phases that help control the number of available and used cores. After the relevant phase has been detected and the micro-scale simulations have been scheduled, any idle cores can be used to update the surrogate model or be released back to the system. We demonstrate HMC performance by testing it on two representative multi-scale applications. We conclude that, considering the subtle interplay between the macro-scale model, surrogate models and micro-scale simulations, HMC provides a promising path towards exascale for many multi-scale applications.
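The two load-balancing ingredients named in the abstract, a surrogate that avoids some micro-scale runs and an execution time prediction that guides scheduling, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the surrogate is a nearest-neighbour lookup with a hypothetical `tolerance`, the scheduler is the generic longest-predicted-time-first heuristic, and the names `select_micro_runs`, `schedule_lpt` and `predict_time` are invented here.

```python
import heapq
from typing import Callable, Dict, List, Tuple


def select_micro_runs(
    requests: List[float],
    cache: Dict[float, float],
    tolerance: float,
) -> Tuple[Dict[float, float], List[float]]:
    """Split macro-scale requests into surrogate hits and required micro runs.

    Assumption: the surrogate is a nearest-neighbour lookup over previously
    computed micro-scale results, accepted when the stored input lies within
    `tolerance` of the requested input.
    """
    served: Dict[float, float] = {}
    to_run: List[float] = []
    for x in requests:
        nearest = min(cache, key=lambda k: abs(k - x), default=None)
        if nearest is not None and abs(nearest - x) <= tolerance:
            served[x] = cache[nearest]  # reuse an earlier micro-scale result
        else:
            to_run.append(x)  # a new micro-scale simulation is needed
    return served, to_run


def schedule_lpt(
    tasks: List[float],
    predict_time: Callable[[float], float],
    n_cores: int,
) -> List[List[float]]:
    """Assign tasks to cores, longest predicted run time first.

    Assumption: `predict_time` stands in for a data-informed execution time
    model; each task goes to the core with the smallest predicted load.
    """
    loads = [(0.0, core) for core in range(n_cores)]
    heapq.heapify(loads)
    plan: List[List[float]] = [[] for _ in range(n_cores)]
    for task in sorted(tasks, key=predict_time, reverse=True):
        load, core = heapq.heappop(loads)  # least-loaded core so far
        plan[core].append(task)
        heapq.heappush(loads, (load + predict_time(task), core))
    return plan
```

In this sketch, requests that miss the surrogate are the only ones passed to `schedule_lpt`, so the number of cores actually needed varies from one macro-scale step to the next, which is the situation the abstract describes when it mentions using idle cores for surrogate updates or releasing them.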



