Parallelization of the streamline simulation based on CUDA

Proceedings of the 3rd International Conference on High Performance Compilation, Computing and Communications. Pub Date: 2019-03-08. DOI: 10.1145/3318265.3318269

Mulan Luo, Xu-sheng Wang, Xiaohui Ji

Citations: 1

Abstract

Parallelization of the streamline simulation based on CUDA

To accelerate streamline simulation and meet real-time demands, we propose a GPU-based method for parallelizing it. The parallel algorithm is implemented with CUDA on a single GPU and on a multi-GPU computer. In our method, a grid is organized into a 2D array of blocks, and the threads in each block are organized into a 1D array, so that each thread computes one streamline. For multiple GPUs, the physical cell model is divided into sub-models so that the number of sub-models equals the number of GPUs. The algorithm is applied to a Tóthian basin as an example. The experimental analysis shows that the speedup depends on the number of GPUs: for a physical model with 40×10^6 cells, it reaches 170 times on a single GPU and 808 times on five GPUs. We conclude that GPUs can greatly accelerate streamline simulation.
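The thread layout and the multi-GPU split described in the abstract can be illustrated with a small host-side sketch. This is not the authors' code: the function names are hypothetical, the flattening follows the conventional CUDA index expression for a 2D grid of 1D blocks, and the split of cells into contiguous, near-equal ranges is an assumption, since the abstract only states that the number of sub-models equals the number of GPUs.

```python
from typing import List, Tuple


def global_streamline_index(block_idx: Tuple[int, int], grid_dim: Tuple[int, int],
                            thread_idx: int, block_dim: int) -> int:
    """Flatten a (2D block index, 1D thread index) pair into one streamline id.

    Mirrors the usual CUDA expression
    (blockIdx.y * gridDim.x + blockIdx.x) * blockDim.x + threadIdx.x.
    """
    bx, by = block_idx
    gx, _gy = grid_dim
    block_id = by * gx + bx          # row-major position of the block in the grid
    return block_id * block_dim + thread_idx


def split_cells(n_cells: int, n_gpus: int) -> List[Tuple[int, int]]:
    """Split n_cells into n_gpus contiguous half-open ranges (assumed scheme)."""
    base, extra = divmod(n_cells, n_gpus)
    ranges = []
    start = 0
    for gpu in range(n_gpus):
        # The first `extra` GPUs take one additional cell when the split is uneven.
        size = base + (1 if gpu < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def parallel_efficiency(speedup_multi: float, speedup_single: float, n_gpus: int) -> float:
    """Multi-GPU speedup relative to ideal linear scaling of the single-GPU speedup."""
    return speedup_multi / (speedup_single * n_gpus)
```

With the reported figures, and assuming both speedups are measured against the same CPU baseline, `parallel_efficiency(808, 170, 5)` evaluates 808 / 850, which is about 0.95 of ideal linear scaling.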
