Further Improvement on GPU-Based Parallel Implementation of WRF 5-Layer Thermal Diffusion Scheme

Melin Huang, Bormin Huang, Jarno Mielikäinen, Hung-Lung Huang, M. Goldberg, A. Mehta
2013 International Conference on Parallel and Distributed Systems, published 2013-12-15. DOI: 10.1109/ICPADS.2013.126
Citations: 7

Abstract

The Weather Research and Forecasting (WRF) model has been widely employed for weather prediction and atmospheric simulation, serving the dual purposes of forecasting and research. Land-surface models (LSMs) are components of the WRF model that provide information on heat and moisture fluxes over land and sea-ice points. The 5-layer thermal diffusion simulation is an LSM based on the MM5 soil temperature model, with an energy budget made up of sensible, latent, and radiative heat fluxes. Because there are no interactions among horizontal grid points, LSMs are well suited to massively parallel processing. The study presented in this article describes parallel-computing work on the WRF 5-layer thermal diffusion scheme using a Graphics Processing Unit (GPU). Since this scheme is only one intermediate module of the entire WRF model, no I/O transfer is involved in the intermediate process. Using one NVIDIA GTX 680 GPU in the case without I/O transfer, our optimization of the GPU-based 5-layer thermal diffusion scheme reaches a speedup as high as 247.5x with respect to one CPU core, whereas the speedup for one CPU socket with respect to one CPU core is only 3.1x. We can further boost the speedup to 332x with respect to one CPU core when three GPUs are applied.
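The key property the abstract relies on is that each horizontal grid point owns an independent vertical soil column, so the scheme is embarrassingly parallel across (i, j). The toy sketch below illustrates that structure with a simple explicit 1-D heat-diffusion step applied to every column of a (layers, ny, nx) array at once; it is not the actual WRF slab-scheme code, and the grid sizes, diffusivity, and boundary treatment are illustrative assumptions.

```python
import numpy as np

def step_soil_temperature(T, kappa, dt, dz):
    """One explicit time step of vertical heat diffusion for every column.

    T has shape (nz, ny, nx): nz soil layers over an ny-by-nx horizontal
    grid.  The update for column (i, j) uses only that column's values --
    no horizontal coupling -- which is why the scheme maps so well onto
    massively parallel hardware such as a GPU.  Illustrative sketch only.
    """
    lap = np.zeros_like(T)
    # Second difference in the vertical for the interior layers.
    lap[1:-1] = (T[2:] - 2.0 * T[1:-1] + T[:-2]) / dz**2
    # Top and bottom layers held fixed here for simplicity.
    return T + dt * kappa * lap

# Toy domain: 5 soil layers over a 4x4 horizontal grid,
# initialized with a linear temperature profile in the vertical.
nz, ny, nx = 5, 4, 4
T = np.linspace(280.0, 290.0, nz)[:, None, None] * np.ones((nz, ny, nx))
T_new = step_soil_temperature(T, kappa=1.0e-6, dt=60.0, dz=0.05)
```

On a GPU one would assign one thread per (i, j) column (or per grid cell), which is exactly the decomposition the independence property permits; the vectorized NumPy form above expresses the same idea on a CPU.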