RaiderSTREAM: Adapting the STREAM Benchmark to Modern HPC Systems

2022 IEEE High Performance Extreme Computing Conference (HPEC). Pub Date: 2022-09-19. DOI: 10.1109/HPEC55821.2022.9926292

Michael Beebe, Brody Williams, Stephen Devaney, John D. Leidel, Yong Chen, Stephen Poole



Abstract


RaiderSTREAM: Adapting the STREAM Benchmark to Modern HPC Systems

Sustaining high memory bandwidth utilization is a common bottleneck to maximizing the performance of scientific applications: runtime is often dominated by the speed at which data can be loaded from memory into the CPU and results written back to memory, particularly for increasingly critical data-intensive workloads. The prevalence of irregular memory access patterns within these applications, exemplified by kernels such as those found in sparse matrix and graph applications, significantly degrades the achievable performance of a system's memory hierarchy. As such, it is highly desirable to measure a given memory hierarchy's sustainable memory bandwidth accurately when designing applications as well as future high-performance computing (HPC) systems. STREAM is a de facto standard benchmark for measuring sustained memory bandwidth and has garnered widespread adoption. In this work, we discuss current limitations of the STREAM benchmark in the context of high-performance and scientific computing. We then introduce a new version of STREAM, called RaiderSTREAM, built on the OpenSHMEM and MPI programming models in tandem with OpenMP, that includes additional kernels which better model irregular memory access patterns in order to address these shortcomings.
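To make the contrast between regular and irregular access concrete, the following C sketch places a unit-stride Triad loop of the kind STREAM times next to an index-driven variant. It is an illustration only and is not taken from RaiderSTREAM: the function names, the `idx` index array, the gather formulation, and the `TRIAD_BYTES_PER_ELEM` constant are all assumptions chosen for this example. The bandwidth helper follows the usual convention of dividing the bytes counted per kernel by the elapsed time.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed accounting for a triad-style kernel: two double loads plus one
 * double store per element. An index-driven variant also loads one index
 * per element, which this constant deliberately ignores. */
#define TRIAD_BYTES_PER_ELEM (3 * sizeof(double))

/* Unit-stride Triad: a[i] = b[i] + q * c[i]. */
static void triad_sequential(double *a, const double *b, const double *c,
                             double q, size_t n)
{
    /* The pragma is ignored when compiled without OpenMP support. */
    #pragma omp parallel for
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + q * c[i];
}

/* Hypothetical gather variant: the loads go through an index array, so
 * consecutive iterations may touch unrelated cache lines. */
static void triad_gather(double *a, const double *b, const double *c,
                         const size_t *idx, double q, size_t n)
{
    #pragma omp parallel for
    for (size_t i = 0; i < n; i++) {
        assert(idx[i] < n); /* indices must stay inside the arrays */
        a[i] = b[idx[i]] + q * c[idx[i]];
    }
}

/* Bandwidth in MB/s (1 MB = 10^6 bytes) for n elements moved in
 * `seconds` of wall-clock time. */
static double bandwidth_mb_per_s(size_t bytes_per_elem, size_t n,
                                 double seconds)
{
    return ((double)bytes_per_elem * (double)n) / (seconds * 1.0e6);
}
```

A driver would allocate arrays much larger than the last-level cache, fill `idx` with a permutation of `0..n-1`, time each kernel over several repetitions, and pass the best time to `bandwidth_mb_per_s`. Comparing the two results on the same machine gives a rough sense of how much an irregular pattern costs relative to streaming access.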
