Towards GPU Acceleration of Phonon Computation with ShengBTE
Yiming Wei, Xin You, Hailong Yang, Zhongzhi Luan, D. Qian
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, published 2020-01-15
DOI: 10.1145/3368474.3368487 (https://doi.org/10.1145/3368474.3368487)
Citations: 2
Abstract
ShengBTE is a commonly used software package for phonon computation (e.g., to determine the lattice thermal conductivity). It simulates phonon diffusion by solving the Boltzmann transport equations, and the high computational complexity of this process leads to long execution times. This paper focuses on optimizing the performance of ShengBTE on GPUs. We identify the performance bottlenecks of ShengBTE and propose corresponding optimizations, including loop-carried dependency elimination, GPU acceleration of hotspot functions, and thread-block performance tuning. Experimental results show that the proposed optimizations significantly improve the performance of ShengBTE, achieving average speedups of 9.06x and 13.74x on discrete-temperature and continuous-temperature simulations, respectively, without loss of accuracy.
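To make the term "loop-carried dependency elimination" concrete, the following is a minimal Python sketch of one common form of it. It is not taken from ShengBTE or from the paper; the function names, the `weight` placeholder, and the array shapes are all hypothetical. The baseline loop advances a running write position, so each iteration depends on the one before it. The rewritten loop derives the position from the iteration index alone, so every iteration is independent and could, in principle, be assigned to its own GPU thread.

```python
def weight(i, j):
    # Hypothetical stand-in for a per-element computation.
    return 0.5 * i + j


def fill_sequential(n_modes, n_points):
    """Baseline: the write position is carried from one iteration to the next."""
    out = [0.0] * (n_modes * n_points)
    pos = 0  # loop-carried state: iteration k needs the value left by k-1
    for i in range(n_modes):
        for j in range(n_points):
            out[pos] = weight(i, j)
            pos += 1
    return out


def fill_independent(n_modes, n_points):
    """Rewritten: the position is computed in closed form from the index."""
    out = [0.0] * (n_modes * n_points)
    for flat in range(n_modes * n_points):
        # No state is shared between iterations, so they may run in any order.
        i, j = divmod(flat, n_points)
        out[flat] = weight(i, j)
    return out
```

Both functions produce the same array; only the second has iterations that can be executed in any order, which is the property a parallel GPU kernel requires.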