Generalised vectorisation for sparse matrix--vector multiplication

A. N. Yzelman
{"title":"Generalised vectorisation for sparse matrix: vector multiplication","authors":"A. N. Yzelman","doi":"10.1145/2833179.2833185","DOIUrl":null,"url":null,"abstract":"This work generalises the various ways in which a sparse matrix--vector (SpMV) multiplication can be vectorised. It arrives at a novel data structure that generalises three earlier well-known data structures for sparse computations: the Blocked CRS format, the (sliced) ELLPACK format, and segmented scan based formats. The new data structure is relevant since efficient use of new hardware requires the use of increasingly wide vector registers. Normally, the use of vectorisation for sparse computations is limited due to bandwidth constraints. In cases where computations are limited by memory latencies instead of memory bandwidth, however, vectorisation can still help performance. The Intel Xeon Phi, appearing as a component in several top-500 supercomputers, displays exactly this behaviour for SpMV multiplication. On this architecture the use of the new generalised vectorisation scheme increases performance up to 178 percent.","PeriodicalId":215872,"journal":{"name":"Proceedings of the 5th Workshop on Irregular Applications: Architectures and Algorithms","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 5th Workshop on Irregular Applications: Architectures and Algorithms","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2833179.2833185","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

This work generalises the various ways in which a sparse matrix--vector (SpMV) multiplication can be vectorised. It arrives at a novel data structure that generalises three earlier well-known data structures for sparse computations: the Blocked CRS format, the (sliced) ELLPACK format, and segmented scan based formats. The new data structure is relevant since efficient use of new hardware requires increasingly wide vector registers. Normally, the use of vectorisation for sparse computations is limited by bandwidth constraints. In cases where computations are limited by memory latencies instead of memory bandwidth, however, vectorisation can still help performance. The Intel Xeon Phi, appearing as a component in several top-500 supercomputers, displays exactly this behaviour for SpMV multiplication. On this architecture the use of the new generalised vectorisation scheme increases performance by up to 178 percent.
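
To make the underlying formats easier to picture, below is a minimal sketch in C of a sliced-ELLPACK style SpMV kernel, one of the three formats the new data structure generalises. The struct layout, field names, and slice height are illustrative assumptions and do not reproduce the paper's actual generalised data structure.

    #include <stddef.h>

    #define S 8  /* slice height; would match the vector register width */

    /* Sliced-ELLPACK layout (illustrative): rows are grouped into slices of S
     * consecutive rows, and each slice stores its nonzeros column-major,
     * zero-padded so every row in the slice has the same length. Padded
     * entries carry value 0.0 and any valid column index. */
    typedef struct {
        size_t n_rows;     /* number of matrix rows (assumed a multiple of S) */
        size_t *slice_len; /* padded row length of each slice                 */
        size_t *col_ind;   /* column indices, stored slice by slice           */
        double *values;    /* matching nonzero values (zero-padded)           */
    } sliced_ell;

    /* y = A * x for a matrix A in the layout above. The innermost loop walks
     * S independent rows, which is exactly the part a compiler (or explicit
     * intrinsics) can map onto the lanes of a wide vector register. */
    void spmv_sliced_ell(const sliced_ell *A, const double *x, double *y)
    {
        size_t offset = 0;                 /* start of the current slice */
        for (size_t s = 0; s < A->n_rows / S; ++s) {
            double acc[S] = { 0.0 };       /* one accumulator per lane   */
            for (size_t k = 0; k < A->slice_len[s]; ++k) {
                for (size_t lane = 0; lane < S; ++lane) {
                    const size_t idx = offset + k * S + lane;
                    acc[lane] += A->values[idx] * x[A->col_ind[idx]];
                }
            }
            for (size_t lane = 0; lane < S; ++lane)
                y[s * S + lane] = acc[lane];
            offset += A->slice_len[s] * S;
        }
    }

Roughly speaking, the other two formats reach the same goal differently: Blocked CRS vectorises within small dense blocks of the matrix, while segmented scan based formats vectorise over a flat stream of nonzeros and use a segmented scan to resolve row boundaries. The paper's contribution is a single data structure that covers all three.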