Implementing Sparse Linear Algebra Kernels on the Lucata Pathfinder-A Computer
Géraud Krawezik, Shannon K. Kuntz, P. Kogge
2020 IEEE High Performance Extreme Computing Conference (HPEC), published 2020-09-22
DOI: 10.1109/HPEC43674.2020.9286207
We present implementations of two sparse linear algebra kernels on a migratory, memory-side processing architecture: Sparse Matrix-Vector multiplication (SpMV) and the Symmetric Gauss-Seidel method (SymGS). Both were chosen because they account for the largest share of the run time of the HPCG benchmark. We introduce the system used for the experiments, its programming model, and the key aspects of extracting the most performance from it. We describe the data distribution that enables efficient parallelization of the algorithms, and their actual implementations. We then present hardware results and simulator traces to explain their behavior. We show almost linear strong scaling for the code, and discuss future work and improvements.