A parallel sparse approximate inverse preconditioning algorithm based on MPI and CUDA

BenchCouncil Transactions on Benchmarks, Standards and Evaluations | Pub Date: 2021-10-01 | DOI: 10.1016/j.tbench.2021.100007

Yizhou Wang, Wenhao Li, Jiaquan Gao


Citations: 2

Abstract


A parallel sparse approximate inverse preconditioning algorithm based on MPI and CUDA

In this study, we present HybridSPAI, an efficient parallel sparse approximate inverse (SPAI) preconditioning algorithm based on MPI and CUDA. HybridSPAI optimizes a recent static SPAI preconditioning algorithm and extends it from a single GPU to multiple GPUs in order to handle large-scale matrices. We make two main contributions: (1) a general parallel framework, based on MPI and CUDA, for optimizing the static SPAI preconditioner, and (2) for each component of the preconditioner, a decision tree that selects the best kernel for computing it. Experimental results show that HybridSPAI is effective and outperforms the popular preconditioning algorithms in two public libraries as well as a recent parallel SPAI preconditioning algorithm.
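For readers unfamiliar with the term, a static SPAI preconditioner is commonly defined as a sparse matrix M with a sparsity pattern fixed in advance that minimizes ||AM - I|| in the Frobenius norm. The minimization splits into one small least-squares problem per column, which is what makes the method attractive for parallel hardware. The sketch below is a minimal serial illustration of that textbook formulation only. It is not the HybridSPAI implementation, its kernels, or its decision trees, none of which are described in the abstract. The choice of A's own pattern for M, the dense list-of-lists storage, the normal-equations solve, and the helper names `solve_dense` and `static_spai` are all assumptions made for illustration.

```python
# Minimal serial sketch of a static SPAI (textbook formulation, NOT HybridSPAI).
# Assumptions: M reuses the sparsity pattern of A; matrices are small dense
# lists of lists; each column's least-squares problem is solved through its
# normal equations.

def solve_dense(G, b):
    """Solve the small dense system G x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(G)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def static_spai(A):
    """Return M minimizing ||A M - I||_F with M restricted to A's pattern."""
    n = len(A)
    M = [[0.0] * n for _ in range(n)]
    for k in range(n):  # every column is an independent problem
        # J: rows of column k of M that are allowed to be nonzero.
        J = [j for j in range(n) if A[j][k] != 0.0]
        # I: rows of A that have a nonzero in at least one column of J.
        I = [i for i in range(n) if any(A[i][j] != 0.0 for j in J)]
        # Normal equations (A_IJ^T A_IJ) m = A_IJ^T e_k for the reduced problem.
        G = [[sum(A[i][p] * A[i][q] for i in I) for q in J] for p in J]
        rhs = [A[k][p] for p in J]  # A_IJ^T e_k picks out row k of A
        for j, v in zip(J, solve_dense(G, rhs)):
            M[j][k] = v
    return M

def residual_norm(A, M):
    """Frobenius norm of A M - I."""
    n = len(A)
    total = 0.0
    for i in range(n):
        for k in range(n):
            entry = sum(A[i][j] * M[j][k] for j in range(n))
            total += (entry - (1.0 if i == k else 0.0)) ** 2
    return total ** 0.5

A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
M = static_spai(A)
print(residual_norm(A, M))
```

Because the loop over `k` has no dependencies between iterations, the columns could in principle be distributed across processes and GPU threads; how HybridSPAI actually partitions the work is not stated in the abstract.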


Source journal

BenchCouncil Transactions on Benchmarks, Standards and Evaluations

CiteScore

4.80

Self-citation rate

0.00%

Number of published articles