Assessing the Memory Wall in Complex Codes

2022 IEEE/ACM Workshop on Memory Centric High Performance Computing (MCHPC). Pub Date: 2022-11-01. DOI: 10.1109/MCHPC56545.2022.00009

G. Shipman, Jered Dominguez-Trujillo, K. Sheridan, S. Swaminarayan


Citations: 1

Abstract

Many of Los Alamos National Laboratory’s (LANL) High Performance Computing (HPC) codes are heavily memory-bandwidth bound. These codes often exhibit high levels of sparse memory access, which differs significantly from the access patterns of industry-standard benchmarks such as STREAM and GUPS. In this paper we present an analysis of some of our most important code bases and their memory access patterns. From this analysis we generate representative micro-benchmarks that preserve the memory access characteristics of our codes using two approaches: one based on statistical sampling of relative memory offsets in a sliding time window at the function level, and another at the loop level. The function-level approach is used to assess the impact of advanced memory technologies such as LPDDR5 and HBM3 using the gem5 [1] simulator. Our simulation results show significant improvements for sparse memory access workloads using HBM3 relative to LPDDR5, and better scaling on a per-core basis. An assessment of two different CPU architectures shows that significantly higher peak memory bandwidth results in high bandwidth on sparse workloads. These two assessments demonstrate the benefits of this workload characterization technique in memory system design and evaluation.
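
To make the idea of sampling relative memory offsets in a sliding window more concrete, the following is a minimal Python sketch. It is not the authors' tooling: the function names, the window definition (the previous `window` accesses rather than a time interval), and the random-walk pattern generator are all assumptions chosen only for illustration.

```python
import random
from collections import Counter


def relative_offset_histogram(trace, window):
    """Count offsets between each access and the `window` accesses before it.

    `trace` is a list of integer addresses (or array indices) in access order.
    Hypothetical helper: the paper's real sampling procedure is not shown here.
    """
    hist = Counter()
    for i, addr in enumerate(trace):
        # Compare the current access against the previous `window` accesses.
        for prev in trace[max(0, i - window):i]:
            hist[addr - prev] += 1
    return hist


def synthesize_indices(hist, length, array_size, seed=0):
    """Generate a synthetic index stream whose steps follow the histogram.

    Assumption: a simple random walk that draws each step from the sampled
    offset distribution and wraps around an array of `array_size` elements.
    """
    rng = random.Random(seed)
    offsets = list(hist.keys())
    weights = list(hist.values())
    idx = 0
    indices = []
    for _ in range(length):
        indices.append(idx)
        step = rng.choices(offsets, weights)[0]
        idx = (idx + step) % array_size
    return indices


if __name__ == "__main__":
    # A toy trace: mostly unit-stride accesses with occasional long jumps.
    trace = [0, 1, 2, 3, 100, 101, 102, 7, 8, 9]
    hist = relative_offset_histogram(trace, window=1)
    pattern = synthesize_indices(hist, length=16, array_size=256)
    print(hist.most_common(3))
    print(pattern)
```

The generated index stream could then drive a gather-style loop over an array, giving a micro-benchmark whose stride distribution loosely resembles the sampled trace, unlike the purely sequential STREAM or purely random GUPS patterns.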

