Acceleration of an Adaptive Cartesian Mesh CFD Solver in the Current Generation Processor Architectures

V. HarichandM., Bharatkumar Sharma, G. Sudhakaran, V. Ashok
Published in: 2018 IEEE 25th International Conference on High Performance Computing (HiPC), 2018-12-01
DOI: 10.1109/HiPC.2018.00025 (https://doi.org/10.1109/HiPC.2018.00025)

Abstract

This paper explores the challenges involved in accelerating PARAS-3D, an adaptive Cartesian mesh CFD solver, on current-generation processors (CPUs and GPUs). CFD codes are known to be memory bound, which remains a significant bottleneck to achieving higher performance. Adaptive Cartesian meshes, with their octree structure, pose additional challenges for data parallelism. Moreover, Cartesian mesh solvers have higher memory bandwidth requirements because of their larger and varying stencils. The paper details how the redesign and reimplementation of a legacy Cartesian mesh CFD solver achieved higher performance on CPUs through improvements in algorithms and data structures. Very good scalability to thousands of cores was achieved using asynchronous communication and weighted graph partitioning. For GPU acceleration, a Structure of Arrays (SoA) data layout was used together with GPU features such as Unified Memory and Multi-Process Service, yielding a 4.4x performance gain over the CPU-only version on NVIDIA Quadro GV100 GPUs.
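
The abstract does not show the solver's data structures, so the following is only a minimal sketch of what a Structure of Arrays layout generally looks like next to the more common Array of Structures layout. The field names (density, momentum_x, energy) and the helper functions to_soa and total_density are hypothetical and are not taken from PARAS-3D. The point it illustrates is that, with one contiguous array per field, a loop over a single field reads memory with unit stride, which usually suits memory-bound kernels and coalesced GPU access.

```cpp
#include <cstddef>
#include <vector>

// Array of Structures (AoS): all fields of one cell sit next to each other.
// Field names are hypothetical, chosen only for illustration.
struct CellAoS {
    double density;
    double momentum_x;
    double energy;
};

// Structure of Arrays (SoA): each field lives in its own contiguous array,
// so a loop that reads one field streams through memory with unit stride.
struct CellsSoA {
    std::vector<double> density;
    std::vector<double> momentum_x;
    std::vector<double> energy;

    explicit CellsSoA(std::size_t n) : density(n), momentum_x(n), energy(n) {}
    std::size_t size() const { return density.size(); }
};

// Convert an AoS container into the SoA layout (hypothetical helper).
CellsSoA to_soa(const std::vector<CellAoS>& cells) {
    CellsSoA out(cells.size());
    for (std::size_t i = 0; i < cells.size(); ++i) {
        out.density[i] = cells[i].density;
        out.momentum_x[i] = cells[i].momentum_x;
        out.energy[i] = cells[i].energy;
    }
    return out;
}

// Sum a single field; in the SoA layout this touches only the density array.
double total_density(const CellsSoA& cells) {
    double sum = 0.0;
    for (double d : cells.density) sum += d;
    return sum;
}
```

In this sketch, a caller would build a std::vector<CellAoS>, pass it to to_soa once, and then run field-wise loops such as total_density on the result. On a GPU the same idea applies with device arrays in place of std::vector, though how PARAS-3D organizes its arrays is not stated in the abstract.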