Evaluating the Performance Impact of Communication Imbalance in Sparse Matrix-Vector Multiplication

G. Utrera, Marisa Gil, X. Martorell
DOI: 10.1109/PDP.2015.37 (https://doi.org/10.1109/PDP.2015.37)
Published in: 2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing
Publication date: 2015-03-04
Citations: 3

Abstract

HPC applications make intensive use of large sparse matrices, and the matrix-vector product accounts for a significant fraction of their total run time. These matrices have non-uniform structure and cause irregular memory accesses, which makes good scalability difficult to achieve on modern HPC platforms with multi- or many-core processors, SIMD units, and high-speed communication networks. One reason for this limited scalability is communication imbalance, in both message synchronization and message size. In this work we analyze this load imbalance in the sparse matrix-vector product (SpMV) when running on a multi-node cluster with a high-speed interconnection network. We evaluate experimental alternatives for reducing communication load imbalance on two programming models (MPI+fork-join and MPI+task-based parallelism) using certain optimizations (i.e., computation-communication overlap or parallel message sends). For large matrix sizes, the performance improvement achieved can be up to 9%.
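
As a rough illustration of the communication imbalance the abstract refers to, the sketch below computes an SpMV on a matrix in CSR format and counts how many remote vector entries each process would have to receive. It is not the authors' implementation: the CSR layout, the 1D block-row partition, the max-over-mean imbalance metric, and all function names (`spmv_csr`, `recv_volumes`, `imbalance`) are assumptions chosen for the example, and no MPI calls are made.

```python
def spmv_csr(indptr, indices, data, x):
    """Return y = A @ x for a matrix A stored in CSR format."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


def recv_volumes(indptr, indices, n_ranks):
    """Count the distinct remote x entries each rank needs.

    Assumption: the matrix is square and uses a 1D block-row partition,
    so rank r owns a contiguous block of rows and the same block of x.
    """
    n = len(indptr) - 1
    block = -(-n // n_ranks)  # ceiling division: rows per rank
    needed = [set() for _ in range(n_ranks)]
    for row in range(n):
        rank = row // block
        for k in range(indptr[row], indptr[row + 1]):
            col = indices[k]
            if col // block != rank:  # x[col] lives on another rank
                needed[rank].add(col)
    return [len(cols) for cols in needed]


def imbalance(volumes):
    """Max-over-mean ratio; 1.0 means perfectly balanced volumes."""
    mean = sum(volumes) / len(volumes)
    return max(volumes) / mean if mean > 0 else 1.0


# Hypothetical 4x4 example matrix:
# [[1, 0, 0, 2],
#  [0, 3, 0, 0],
#  [4, 5, 6, 0],
#  [0, 0, 0, 7]]
indptr = [0, 2, 3, 6, 7]
indices = [0, 3, 1, 0, 1, 2, 3]
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

y = spmv_csr(indptr, indices, data, [1.0, 1.0, 1.0, 1.0])
volumes = recv_volumes(indptr, indices, n_ranks=2)
ratio = imbalance(volumes)
```

With two ranks, rank 0 needs one remote entry of x and rank 1 needs two, so the receive volumes are uneven even though both ranks own the same number of rows. Irregular sparsity patterns in real matrices can produce this kind of skew, which is the sort of imbalance that optimizations such as overlapping computation with communication aim to hide.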