{"title":"A New Approach for Accelerating the Sparse Matrix-Vector Multiplication","authors":"P. Tvrdík, I. Šimeček","doi":"10.1109/SYNASC.2006.4","journal":{"name":"2006 Eighth International Symposium on Symbolic and Numeric Algorithms for Scientific Computing","volume":null,"pages":null},"publicationDate":"2006-09-26","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"7","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/SYNASC.2006.4"}
A New Approach for Accelerating the Sparse Matrix-Vector Multiplication
Sparse matrix-vector multiplication (SpM×V for short) is one of the most common subroutines in numerical linear algebra. The problem is that memory access patterns during SpM×V are irregular, so cache utilization can suffer from low spatial or temporal locality. This paper introduces a new approach for accelerating SpM×V. The approach consists of three steps. The first step divides the whole matrix into smaller parts (regions) that fit in the cache. The second step improves locality during the multiplication through better utilization of distant references. The last step maximizes the machine's computational performance for the partial multiplication of each region. In this paper, we describe these three steps in more detail, including fast and inexpensive algorithms for each step. Our measurements show that the approach gives a significant speedup for almost all matrices arising from various technical areas.
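To make the first step concrete, the following is a minimal sketch of dividing a matrix into regions and multiplying region by region. It is not the authors' algorithm: the abstract does not specify the storage format or the division rule, so the CSR layout, the greedy row-block split, the `max_nnz` budget (a stand-in for "fits in the cache"), and all function names are assumptions made for illustration. The second and third steps are not shown.

```python
def spmv_csr(row_ptr, col_idx, vals, x, y, row_start, row_end):
    """Compute y[i] = sum_j A[i, j] * x[j] for rows in [row_start, row_end).

    A is stored in CSR form (assumed here; the abstract names no format).
    """
    for i in range(row_start, row_end):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x[col_idx[k]] is the irregular access that hurts locality.
            acc += vals[k] * x[col_idx[k]]
        y[i] = acc


def split_into_regions(row_ptr, max_nnz):
    """Greedily group consecutive rows into regions of at most max_nnz nonzeros.

    max_nnz is a hypothetical proxy for the cache capacity. A single row
    with more than max_nnz nonzeros becomes a region of its own.
    """
    n_rows = len(row_ptr) - 1
    regions = []
    start = 0
    for i in range(n_rows):
        # Would adding row i push the current region over the budget?
        if i > start and row_ptr[i + 1] - row_ptr[start] > max_nnz:
            regions.append((start, i))
            start = i
    if start < n_rows:
        regions.append((start, n_rows))
    return regions


def spmv_by_regions(row_ptr, col_idx, vals, x, max_nnz):
    """Multiply region by region; the result equals a plain CSR SpMV."""
    y = [0.0] * (len(row_ptr) - 1)
    for row_start, row_end in split_into_regions(row_ptr, max_nnz):
        spmv_csr(row_ptr, col_idx, vals, x, y, row_start, row_end)
    return y


# 3x3 example: [[1, 0, 2], [0, 3, 0], [4, 0, 5]] times [1, 2, 3]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]
print(spmv_by_regions(row_ptr, col_idx, vals, x, max_nnz=3))
```

A row-block split only bounds the matrix data touched per region. It does not bound which entries of `x` are read, which is presumably where the locality improvements of the second step come in.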