Large Scale Manycore-Aware PIC Simulation with Efficient Particle Binning

H. Nakashima, Yoshiki Summura, Keisuke Kikura, Y. Miyake
2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), published 2017-05-01
DOI: 10.1109/IPDPS.2017.65 (https://doi.org/10.1109/IPDPS.2017.65)
Citations: 5

Abstract

We are now developing a manycore-aware implementation of multiprocessed PIC (particle-in-cell) simulation code with automatic load balancing. A key issue of the implementation is how to exploit the wide SIMD mechanism of manycore processors such as Intel Xeon Phi. Our solution is "particle binning" to rank all particles in a cell (voxel) in a chunk of SOA (structure-of-arrays) type one-dimensional arrays so that "particle-push" and "current-scatter" operations on them are efficiently SIMD-vectorized by our compiler. In addition, our sophisticated binning mechanism performs sorting of particles according to their positions "on-the-fly", efficiently coping with occasional "bin overflow" in a fully multithreaded manner. Our performance evaluation with up to 64 nodes of Cray XC30 and XC40 supercomputers, equipped with Xeon Phi 5120D (Knights Corner) and 7250 (Knights Landing) respectively, not only exhibited good parallel performance, but also proved the effectiveness of our binning mechanism.
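
As a rough illustration of the binning idea described above, the following minimal sketch sorts particles by cell with a plain counting sort so that every cell owns one contiguous chunk of the SOA arrays, and then pushes one cell's chunk in a single tight loop. It is not the authors' implementation: the paper's mechanism sorts on the fly, handles bin overflow, and runs multithreaded, none of which is modeled here. The function names `bin_particles` and `push_cell`, the 1D grid, the per-cell field `e_field`, and the parameters `qm` and `dt` are all assumptions made for this example.

```python
from itertools import accumulate


def bin_particles(x, v, cell_width, n_cells):
    """Counting-sort particles by cell index into SOA arrays (hypothetical helper).

    Returns (x_sorted, v_sorted, offsets); the particles of cell c occupy the
    contiguous slice offsets[c]:offsets[c + 1] of both arrays.
    Positions are assumed to be non-negative and inside the grid.
    """
    # Cell index of every particle, clamped to the last cell.
    cell = [min(int(xi / cell_width), n_cells - 1) for xi in x]

    # Histogram of particles per cell, then a prefix sum gives each chunk's start.
    counts = [0] * n_cells
    for c in cell:
        counts[c] += 1
    offsets = [0] + list(accumulate(counts))

    # Scatter each particle into the next free slot of its cell's chunk.
    cursor = offsets[:-1].copy()
    x_sorted = [0.0] * len(x)
    v_sorted = [0.0] * len(v)
    for xi, vi, c in zip(x, v, cell):
        x_sorted[cursor[c]] = xi
        v_sorted[cursor[c]] = vi
        cursor[c] += 1
    return x_sorted, v_sorted, offsets


def push_cell(x_sorted, v_sorted, offsets, c, e_field, qm, dt):
    """Push all particles of cell c using that cell's field value (assumed model).

    Every iteration reads the same field value and walks unit-stride arrays,
    which is the access pattern a vectorizing compiler can exploit.
    """
    for i in range(offsets[c], offsets[c + 1]):
        v_sorted[i] += qm * e_field[c] * dt
        x_sorted[i] += v_sorted[i] * dt
```

Because all particles in a chunk share their cell's field data, the inner loop has no indirect loads; in a compiled language that loop is the natural candidate for SIMD vectorization, which is the motivation the abstract gives for binning.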