Hybrid MPI+openMP Implementation of eXtended Discrete Element Method

Abdoul Wahid Mainassara Checkaraou, A. Rousset, Xavier Besseron, S. Varrette, B. Peters
Published in: 2018 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2018-09-01
DOI: 10.1109/CAHPC.2018.8645880 (https://doi.org/10.1109/CAHPC.2018.8645880)
Citations: 10

Abstract

The eXtended Discrete Element Method (XDEM) is a novel numerical simulation technique that extends the classical Discrete Element Method (DEM), which simulates the motion of granular material, with additional properties for each particle, such as chemical composition, thermodynamic state, and stress/strain. It has been applied successfully in numerous industries that process granular materials such as sand, rock, wood, or coke [16], [17]. In this context, computational simulation with (X)DEM has become an increasingly essential tool for researchers and engineers to set up and explore their experimental processes. However, increasing the size or the accuracy of a model requires a parallelized implementation running on High Performance Computing (HPC) platforms to accommodate the growing needs in memory and computation time. In practice, such parallelization is traditionally obtained using MPI (distributed-memory computing), OpenMP (shared-memory computing), or a hybrid approach combining both. In this paper, we present the results of our effort to implement an OpenMP version of XDEM that allows hybrid MPI+OpenMP simulations (XDEM was already parallelized with MPI). Far from the basic OpenMP paradigm and recommendations, which amount to decorating the main computation loops with a set of OpenMP pragmas, the OpenMP parallelization of XDEM required fundamental code refactoring and careful tuning to reach good performance. There are two main reasons for these difficulties. First, XDEM is a legacy code that has been developed for more than 10 years and was initially focused on accuracy rather than performance. Second, the particles in a DEM simulation are highly dynamic: they can be added or deleted, and their interaction relations can change, at any timestep of the simulation. This article therefore details the multiple layers of optimization applied, such as in-depth profiling and reorganization of data structures, the use of fast multithreaded memory allocators, and advanced process/thread-to-core pinning techniques. Experimental results evaluate the benefit of each optimization individually and validate the implementation using a real-world application executed on the HPC platform of the University of Luxembourg. Finally, we present our hybrid MPI+OpenMP results, which show a 15%-20% performance gain, and explain how the hybrid approach overcomes the scalability limits of pure-MPI XDEM simulations by allowing the number of compute cores to increase without a drop in performance.
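
For readers unfamiliar with the "basic OpenMP paradigm" the abstract refers to, the following is a minimal sketch of what decorating a computation loop with OpenMP pragmas can look like. It is not taken from XDEM: the `Particle` record, the one-dimensional state, and the functions `integrate` and `kinetic_energy` are hypothetical names chosen for illustration, and the semi-implicit Euler update is an assumed stand-in for a real DEM time step. According to the abstract, this style of parallelization alone was not sufficient for XDEM.

```cpp
#include <vector>

// Hypothetical particle record (an assumption for illustration only;
// the abstract does not describe XDEM's actual data structures).
struct Particle {
    double x;  // position (1D for brevity)
    double v;  // velocity
    double f;  // force currently acting on the particle
    double m;  // mass
};

// Semi-implicit Euler step: update the velocity first, then the position.
// Each iteration reads and writes only its own particle, so iterations are
// independent and the loop can be divided among threads.
void integrate(std::vector<Particle>& ps, double dt) {
    const long n = static_cast<long>(ps.size());
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i) {
        ps[i].v += dt * ps[i].f / ps[i].m;
        ps[i].x += dt * ps[i].v;
    }
}

// Total kinetic energy. The reduction clause gives each thread a private
// partial sum and combines the partial sums when the loop ends.
double kinetic_energy(const std::vector<Particle>& ps) {
    double e = 0.0;
    const long n = static_cast<long>(ps.size());
    #pragma omp parallel for reduction(+ : e)
    for (long i = 0; i < n; ++i) {
        e += 0.5 * ps[i].m * ps[i].v * ps[i].v;
    }
    return e;
}
```

With GCC or Clang the pragmas take effect when the code is compiled with `-fopenmp`; without that flag they are ignored and the loops run serially with the same result. A loop like this parallelizes cleanly only because the particle set is fixed during the loop and the iterations do not interact, which is precisely what the abstract says does not hold in general for a DEM simulation where particles and their interaction relations change between timesteps.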