Accelerating Weather Prediction Using Near-Memory Reconfigurable Fabric

Gagandeep Singh, D. Diamantopoulos, Juan Gómez-Luna, C. Hagleitner, S. Stuijk, H. Corporaal, O. Mutlu
Journal: ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Publication date: 2021-07-19
DOI: 10.1145/3501804 (https://doi.org/10.1145/3501804)
Citations: 20

Abstract

Ongoing climate change calls for fast and accurate weather and climate modeling. However, when solving large-scale weather prediction simulations, state-of-the-art CPU and GPU implementations suffer from limited performance and high energy consumption. These implementations are dominated by complex irregular memory access patterns and low arithmetic intensity, which pose fundamental challenges to acceleration. To overcome these challenges, we propose and evaluate near-memory acceleration using a reconfigurable fabric with high-bandwidth memory (HBM). We focus on compound stencils, which are fundamental kernels in weather prediction models. Using high-level synthesis techniques, we develop NERO, a field-programmable gate array (FPGA)+HBM-based accelerator connected through the Open Coherent Accelerator Processor Interface (OpenCAPI) to an IBM POWER9 host system. Our experimental results show that NERO outperforms a 16-core POWER9 system by \( 5.3\times \) and \( 12.7\times \) when running two different compound stencil kernels. NERO reduces the energy consumption by \( 12\times \) and \( 35\times \) for the same two kernels over the POWER9 system, with an energy efficiency of 1.61 GFLOPS/W and 21.01 GFLOPS/W, respectively. We conclude that employing near-memory acceleration solutions for weather prediction modeling is promising as a means to achieve both high performance and high energy efficiency.
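To make the term "compound stencil" concrete, the following is a minimal sketch in plain Python of a generic two-stage stencil: a 5-point Laplacian whose output feeds a second Laplacian, followed by a pointwise update. It is an illustration only and is not one of the kernels evaluated with NERO; the function names, the coefficient `coeff`, and the choice of a double Laplacian are assumptions made for this example. Each output point needs only a handful of arithmetic operations per value loaded from a neighborhood of the grid, which is the kind of low arithmetic intensity the abstract refers to.

```python
# Illustrative only: a generic two-stage compound stencil.
# This is NOT NERO's implementation; names and coefficients are hypothetical.


def laplacian(field):
    """Return the 5-point Laplacian of a 2D grid (list of lists).

    Boundary cells of the result are left at 0.0.
    """
    rows, cols = len(field), len(field[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = (field[i - 1][j] + field[i + 1][j]
                         + field[i][j - 1] + field[i][j + 1]
                         - 4.0 * field[i][j])
    return out


def compound_stencil(field, coeff=0.1):
    """Chain two stencils: out = field - coeff * laplacian(laplacian(field)).

    Only cells at least two cells away from the boundary are updated,
    because the second stage needs valid first-stage values around it.
    """
    rows, cols = len(field), len(field[0])
    lap = laplacian(field)    # stage 1: reads the input grid
    lap2 = laplacian(lap)     # stage 2: reads the stage-1 result
    out = [row[:] for row in field]
    for i in range(2, rows - 2):
        for j in range(2, cols - 2):
            out[i][j] = field[i][j] - coeff * lap2[i][j]
    return out


if __name__ == "__main__":
    # A single impulse in the middle of a 5x5 grid.
    grid = [[0.0] * 5 for _ in range(5)]
    grid[2][2] = 1.0
    for row in compound_stencil(grid):
        print(row)
```

The intermediate grid `lap` is written in full and then read back by the second stage. In a design where intermediates like this travel through main memory, that traffic adds to the memory-bound behavior; keeping such data close to the compute units is the general motivation for near-memory acceleration.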