X. Lei, Tongxiang Gu, S. Graillat, Xiaowen Xu, Jing Meng
Comparison of Reproducible Parallel Preconditioned BiCGSTAB Algorithm Based on ExBLAS and ReproBLAS
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, published 2023-02-27
DOI: 10.1145/3578178.3578234 (https://doi.org/10.1145/3578178.3578234)
Citations: 1
Abstract
Krylov subspace algorithms are important methods for solving linear systems, and parallelism is often applied to solve large-scale systems efficiently. However, parallel execution often amplifies the effects of the non-associativity of floating-point operations, which can make computations non-reproducible. This paper compares the performance of the parallel preconditioned BiCGSTAB algorithm implemented with two different libraries, ExBLAS and ReproBLAS, both of which can ensure reproducible computations. To control for the effect of the compiler, we explicitly use fused multiply-add (FMA) instructions. Numerical experiments show that the BiCGSTAB algorithm is reproducible with both BLAS implementations. Of the two, the ExBLAS-based version is more accurate but more time-consuming than the ReproBLAS-based one.
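The non-reproducibility described in the abstract can be illustrated with a small sketch. This is not the paper's code and uses neither ExBLAS nor ReproBLAS: the helper names `naive_sum` and `chunked_sum` are hypothetical, the chunking only imitates a parallel reduction, and Python's `math.fsum` (a correctly rounded sum) stands in for a reproducible summation routine.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point summation."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Imitate a parallel reduction: sum each chunk, then combine the partial sums."""
    size = math.ceil(len(values) / n_chunks)
    partials = [naive_sum(values[i:i + size]) for i in range(0, len(values), size)]
    return naive_sum(partials)


# Floating-point addition is not associative.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False

# The exact sum of these values is 2.0, but the result of a naive
# reduction depends on how the work is split across "processes".
values = [1e17, 1.0, -1e17, 1.0]
print(chunked_sum(values, 1))  # 1.0
print(chunked_sum(values, 2))  # 0.0

# A correctly rounded sum does not depend on the order of the operands.
print(math.fsum(values))          # 2.0
print(math.fsum(sorted(values)))  # 2.0
```

In a Krylov solver such as BiCGSTAB, dot products and norms are reductions of this kind, so a change in the number of processes or threads can change the rounding and therefore the iterates, unless the reduction itself is made reproducible.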