Accelerating frequency-domain simulations using small shared-memory CPU/GPU cluster
T. Topa, A. Noga, A. Karwowski
2016 21st International Conference on Microwave, Radar and Wireless Communications (MIKON), 2016-05-09
DOI: 10.1109/MIKON.2016.7492098 (https://doi.org/10.1109/MIKON.2016.7492098)
Citations: 0
Abstract
A numerical approach to frequency response problems usually requires that the governing system equation be solved repeatedly at many frequencies. The computational efficiency of the overall process can be increased by departing from the traditional sequential computing model in favor of the parallel processing capability commonly offered by modern hardware. In this paper, we consider a hybrid programming pattern, OpenMP + CUDA, from the perspective of a user of a rather typical low-cost multi-core CPU-based workstation that can accommodate up to four GPUs. Such small-scale heterogeneous platforms have recently gained wide popularity in scientific computing as an inexpensive massively parallel architecture. The relevant programming model issues and performance questions are addressed. Experimental results for the example physics problem, namely electromagnetic scattering from a perfectly electrically conducting body, show that a significant performance improvement can be attained with the OpenMP + CUDA programming model.
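Because the solves at different frequencies are independent of one another, the sweep can be distributed over worker threads. The following is a minimal sketch of the OpenMP side of such a pattern only, not the authors' implementation. The names `solve_at_frequency` and `frequency_sweep` are hypothetical, and the per-frequency work is replaced by a scalar series-RLC circuit (assumed values R = 50 Ω, L = 1 µH, C = 1 nF) standing in for the matrix assembly and solve that a real code would run on a GPU.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical stand-in for the per-frequency work. In a hybrid
// OpenMP + CUDA code this is where the system matrix would be assembled
// and solved on the GPU bound to the calling thread. Here the "system"
// is a scalar: the current in a series RLC circuit driven by 1 V.
cplx solve_at_frequency(double f_hz) {
    const double pi = 3.14159265358979323846;
    const double R = 50.0, L = 1e-6, C = 1e-9;   // assumed example values
    const double w = 2.0 * pi * f_hz;
    const cplx Z(R, w * L - 1.0 / (w * C));      // series impedance
    return cplx(1.0, 0.0) / Z;
}

// Distribute the independent frequency points over num_workers threads.
// With one thread per GPU, each thread would first select its device
// (for example with the CUDA runtime call cudaSetDevice) at the top of
// the parallel region. If the file is compiled without OpenMP support,
// the pragmas are ignored and the loop runs sequentially.
std::vector<cplx> frequency_sweep(const std::vector<double>& freqs,
                                  int num_workers) {
    std::vector<cplx> response(freqs.size());
    const long long n = static_cast<long long>(freqs.size());
    #pragma omp parallel num_threads(num_workers)
    {
        // Dynamic scheduling hands out one frequency at a time, which
        // helps when the cost per frequency is not uniform.
        #pragma omp for schedule(dynamic)
        for (long long i = 0; i < n; ++i) {
            response[i] = solve_at_frequency(freqs[i]);
        }
    }
    return response;
}
```

Calling `frequency_sweep(freqs, 4)` would correspond to a workstation with four GPUs, one driving thread per device. Each thread writes only to its own entries of `response`, so no synchronization is needed beyond the implicit barrier at the end of the loop.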