Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures

Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems. Pub Date: 2022-06-06. DOI: 10.1145/3489048.3522661

Christina Giannoula, Ivan Fernandez, Juan Gómez-Luna, N. Koziris, G. Goumas, O. Mutlu


Citations: 37

Abstract



Several manufacturers have already started to commercialize near-bank Processing-In-Memory (PIM) architectures, after decades of research efforts. Near-bank PIM architectures place simple cores close to DRAM banks. Recent research demonstrates that they can yield significant performance and energy improvements in parallel applications by alleviating data access costs. Real PIM systems can provide high levels of parallelism, large aggregate memory bandwidth and low memory access latency, making them a good fit for accelerating the Sparse Matrix Vector Multiplication (SpMV) kernel. SpMV has been characterized as one of the most significant and thoroughly studied scientific computation kernels. It is primarily a memory-bound kernel with intensive memory accesses due to its algorithmic nature, the compressed matrix format used, and the sparsity patterns of the input matrices. This paper provides the first comprehensive analysis of SpMV on a real-world PIM architecture, and presents SparseP, the first SpMV library for real PIM architectures. We make two key contributions. First, we design efficient SpMV algorithms to accelerate the SpMV kernel in current and future PIM systems, while covering a wide variety of sparse matrices with diverse sparsity patterns. Second, we provide the first comprehensive analysis of SpMV on a real PIM architecture. Specifically, we conduct a rigorous experimental analysis of SpMV kernels on the UPMEM PIM system, the first publicly available real-world PIM architecture. Our extensive evaluation provides new insights and recommendations for software designers and hardware architects to efficiently accelerate the SpMV kernel on real PIM systems. For more information about our thorough characterization of SpMV PIM execution, results, insights and the open-source SparseP software package [21], we refer the reader to the full version of the paper [3, 4]. The SparseP software package is publicly and freely available at https://github.com/CMU-SAFARI/SparseP.
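
To make the memory-bound nature of SpMV concrete, the following is a minimal sketch of the kernel over a matrix in Compressed Sparse Row (CSR) form. The abstract does not name a specific format, so CSR is an assumption chosen for illustration. The function name `spmv_csr` and the small example matrix are hypothetical, and this is not code from SparseP.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    row_ptr[i]:row_ptr[i + 1] is the range of nonzeros belonging to row i;
    col_idx[k] and values[k] are the column and value of the k-th nonzero.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect read of x: which element is touched depends on the
            # matrix's sparsity pattern, so accesses to x are irregular.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical 3x3 example matrix (illustration only):
# [[10, 0, 2],
#  [ 0, 3, 0],
#  [ 1, 0, 4]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [10.0, 2.0, 3.0, 1.0, 4.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
print(y)  # [16.0, 6.0, 13.0]
```

Each nonzero contributes one multiply and one add, yet requires loading a value, a column index, and an element of `x`. This low ratio of arithmetic to data movement is why the kernel tends to be limited by memory access and not by compute.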
