R. Mahfoudhi, Sami Achour, O. Hamdi-Larbi, Z. Mahjoub
High Performance Recursive Matrix Inversion for Multicore Architectures
2017 International Conference on High Performance Computing & Simulation (HPCS), published 2017-07-01
DOI: https://doi.org/10.1109/HPCS.2017.104
Citations: 3
Abstract
There are several approaches to computing the inverse of a dense square matrix A, namely Gaussian elimination, block-wise inversion, and LU factorization (LUF). The latter is used in mathematical software libraries such as ScaLAPACK, PBLAS and MATLAB. Once the two factors L and U are known (where A = LU, as computed by PDGETRF), the ScaLAPACK inversion routine (PDGETRI) first inverts U and then solves a triangular matrix system that yields A^{-1}. A symmetric approach first inverts L and then solves a matrix system that yields A^{-1}. Alternatively, one could compute the inverses of both U and L and then form their product, A^{-1} = U^{-1} L^{-1}. On the other hand, the Strassen fast matrix inversion algorithm is known as an efficient alternative for this problem. In this paper we propose a series of versions of parallel dense matrix inversion based on the 'Divide and Conquer' paradigm. A theoretical performance study allows an accurate comparison of the designed algorithms. A series of experiments validates the contribution and shows good performance for large matrix sizes, i.e., up to 40% faster than ScaLAPACK.
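To make the divide-and-conquer idea concrete, the following is a minimal sequential sketch of Strassen-style block inversion in pure Python. It is not the authors' implementation: the names `block_inverse`, `matmul`, `sub`, `neg` and `split` are hypothetical, no pivoting or parallelism is included, and the code assumes that every leading block and every Schur complement met during the recursion is invertible (which holds, for example, for strictly diagonally dominant matrices).

```python
# Illustrative sketch only: sequential, no pivoting, not the paper's code.
# Assumption: all leading blocks and Schur complements are invertible.

def matmul(X, Y):
    """Plain product of two list-of-lists matrices."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def sub(X, Y):
    """Entrywise difference X - Y."""
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def neg(X):
    """Entrywise negation -X."""
    return [[-a for a in row] for row in X]

def split(A, k):
    """Split A into four blocks; the top-left block is k-by-k."""
    A11 = [row[:k] for row in A[:k]]
    A12 = [row[k:] for row in A[:k]]
    A21 = [row[:k] for row in A[k:]]
    A22 = [row[k:] for row in A[k:]]
    return A11, A12, A21, A22

def block_inverse(A):
    """Recursively invert a square matrix by 2-by-2 block partitioning."""
    n = len(A)
    if n == 1:
        return [[1.0 / A[0][0]]]
    k = n // 2
    A11, A12, A21, A22 = split(A, k)
    R1 = block_inverse(A11)            # A11^{-1} (first recursive call)
    R2 = matmul(A21, R1)               # A21 A11^{-1}
    R3 = matmul(R1, A12)               # A11^{-1} A12
    S = sub(A22, matmul(A21, R3))      # Schur complement of A11 in A
    C22 = block_inverse(S)             # S^{-1} (second recursive call)
    C12 = neg(matmul(R3, C22))         # -A11^{-1} A12 S^{-1}
    C21 = neg(matmul(C22, R2))         # -S^{-1} A21 A11^{-1}
    C11 = sub(R1, matmul(R3, C21))     # A11^{-1} + A11^{-1} A12 S^{-1} A21 A11^{-1}
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom

if __name__ == "__main__":
    A = [[4.0, 1.0, 2.0], [1.0, 5.0, 3.0], [2.0, 3.0, 6.0]]
    A_inv = block_inverse(A)
    # The product should be close to the identity, up to rounding error.
    for row in matmul(A, A_inv):
        print([round(x, 12) for x in row])
```

Each level of the recursion performs two half-size inversions and a handful of matrix products. In a multicore setting those products are the natural place to exploit parallelism, though how the paper schedules them is not stated in the abstract.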