Some optimization techniques of the matrix multiplication algorithm
Nenad Anchev, M. Gusev, S. Ristov, Blagoj Atanasovski
Proceedings of the ITI 2013 35th International Conference on Information Technology Interfaces, published 2013-06-24
DOI: 10.2498/iti.2013.0572 (https://doi.org/10.2498/iti.2013.0572)
Citations: 8
Abstract
The dense matrix-matrix multiplication algorithm is widely used in large scientific applications and is often an important factor in the application's overall performance. Optimizing this algorithm for both parallel and serial execution would therefore improve overall performance. In this paper we give an overview of the most widely used dense matrix multiplication optimization techniques applicable to multicore processors. These methods can speed up multicore parallel execution by reducing the number of memory accesses and by adapting the algorithm to the hardware architecture and organization.
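
The abstract does not say which techniques the paper covers. As an illustration only, the sketch below shows loop blocking (tiling), one commonly used way to reduce memory traffic in dense matrix multiplication. The function names `matmul_naive` and `matmul_blocked` and the `block` parameter are assumptions made for this example and are not taken from the paper.

```python
def matmul_naive(a, b):
    """Reference triple-loop product of two dense matrices (lists of rows)."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_blocked(a, b, block=2):
    """Same product, computed tile by tile (loop blocking).

    `block` is a hypothetical tuning parameter: on real hardware it would be
    chosen so that the tiles being worked on fit in cache together.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # Outer loops walk over tiles of the three index ranges.
    for ii in range(0, n, block):
        for kk in range(0, m, block):
            for jj in range(0, p, block):
                # Inner loops stay inside one tile, so the same small
                # pieces of a, b and c are reused before moving on.
                for i in range(ii, min(ii + block, n)):
                    for k in range(kk, min(kk + block, m)):
                        aik = a[i][k]  # loaded once, reused across j
                        for j in range(jj, min(jj + block, p)):
                            c[i][j] += aik * b[k][j]
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(matmul_blocked(a, b) == matmul_naive(a, b))
```

In pure Python the interpreter overhead hides any cache effect, so this sketch only shows the access pattern. A measurable speedup would be expected from the same loop structure in a compiled language, with the tile size tuned to the target processor's cache.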