RaiderSTREAM: Adapting the STREAM Benchmark to Modern HPC Systems

Michael Beebe, Brody Williams, Stephen Devaney, John D. Leidel, Yong Chen, Stephen Poole
Published in: 2022 IEEE High Performance Extreme Computing Conference (HPEC), 2022-09-19. DOI: 10.1109/HPEC55821.2022.9926292 (https://doi.org/10.1109/HPEC55821.2022.9926292)

Abstract

Sustaining high memory bandwidth utilization is a common bottleneck to maximizing the performance of scientific applications: runtime is dominated by the speed at which data can be loaded from memory into the CPU and results written back to memory, particularly for increasingly critical data-intensive workloads. The prevalence of irregular memory access patterns within these applications, exemplified by kernels such as those found in sparse matrix and graph applications, significantly degrades the achievable performance of a system's memory hierarchy. As such, it is highly desirable to be able to accurately measure a given memory hierarchy's sustainable memory bandwidth when designing applications as well as future high-performance computing (HPC) systems. STREAM is a de facto standard benchmark for measuring sustained memory bandwidth and has garnered widespread adoption. In this work, we discuss current limitations of the STREAM benchmark in the context of high-performance and scientific computing. We then introduce a new version of STREAM, called RaiderSTREAM, built on the OpenSHMEM and MPI programming models in tandem with OpenMP, that includes additional kernels which better model irregular memory access patterns in order to address these shortcomings.
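To make the contrast between regular and irregular access concrete, the following is a minimal sketch in C with OpenMP, not the RaiderSTREAM source. It shows a sequential STREAM-style triad next to a gather variant whose reads go through an index array, which is one common way to model irregular access. The function names (`triad_seq`, `triad_gather`, `triad_bytes`, `bandwidth_mb_s`) and the choice of a gather on the read side are assumptions made for illustration; the kernels actually added by RaiderSTREAM are not specified in the abstract.

```c
#include <stddef.h>

/* Sequential STREAM-style triad: a[i] = b[i] + scalar * c[i].
 * Every array is walked with unit stride. */
void triad_seq(double *a, const double *b, const double *c,
               double scalar, size_t n)
{
    #pragma omp parallel for
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + scalar * c[i];
}

/* Hypothetical gather variant: the reads of b and c are redirected
 * through idx, so consecutive iterations may touch distant addresses.
 * Each idx[i] must be a valid index into b and c. */
void triad_gather(double *a, const double *b, const double *c,
                  const size_t *idx, double scalar, size_t n)
{
    #pragma omp parallel for
    for (size_t i = 0; i < n; i++)
        a[i] = b[idx[i]] + scalar * c[idx[i]];
}

/* Bytes counted for one triad pass under the usual STREAM convention:
 * two loads and one store per element. Traffic for the index array in
 * the gather variant is deliberately not included here. */
double triad_bytes(size_t n)
{
    return 3.0 * sizeof(double) * (double)n;
}

/* Bandwidth in MB/s (10^6 bytes per second) from bytes moved and the
 * elapsed time in seconds. */
double bandwidth_mb_s(double bytes, double seconds)
{
    return 1.0e-6 * bytes / seconds;
}
```

With arrays much larger than the last-level cache and a shuffled `idx`, timing both kernels (for example with `omp_get_wtime`) and passing `triad_bytes(n)` and the elapsed time to `bandwidth_mb_s` would be expected to show lower sustained bandwidth for the gather variant. The size of that gap depends on the machine and is not something this sketch predicts.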