Towards the design of an automatically tuned linear algebra library
J. Cuenca, D. Giménez, José González
Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing, 2002-01-09
DOI: 10.1109/EMPDP.2002.994270 (https://doi.org/10.1109/EMPDP.2002.994270)
Citations: 13
Abstract
In this work we propose the architecture of an automatically tuned linear algebra library, composed of a set of linear algebra routines together with their installation routines. During installation on a system, the linear algebra routines are tuned automatically to the system conditions: the hardware characteristics and the basic libraries used by the linear algebra routines. The design methodology is analysed with a block LU factorisation. Variants of a sequential and a parallel version of this routine, the latter on a logical rectangular mesh of processors, are considered. An analytical model of the algorithm is developed as the basis of our methodology, and the behaviour of the algorithm is analysed with message passing using MPI on several platforms (a network of SUN workstations, SGI Origin 2000 and IBM SP2) and with different basic linear algebra libraries (reference BLAS, machine-specific BLAS and ATLAS). The experiments show that it is possible to make a good automatic choice of the configurable parameters of the linear algebra routines during the installation process. The average execution time of the linear algebra routine is reduced by about 15% with respect to the non-tuned version.
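As an illustration of the idea of choosing a configurable parameter at installation time, the following is a minimal sketch for the sequential case, with the block size of a block LU factorisation as the parameter. It is not the authors' implementation. The names `block_lu`, `make_matrix`, `lu_product` and `tune_block_size` are hypothetical, the factorisation omits pivoting (the test matrix is made diagonally dominant so that this is safe), and the selection is done by direct timing of candidate block sizes rather than by the analytical model the abstract describes.

```python
import random
import time


def block_lu(a, b):
    """Right-looking block LU without pivoting, in place.

    Returns the packed matrix: unit-lower L below the diagonal, U on and above it.
    """
    n = len(a)
    for k in range(0, n, b):
        e = min(k + b, n)
        # 1. Unblocked LU of the panel (diagonal block plus the rows below it).
        for j in range(k, e):
            for i in range(j + 1, n):
                a[i][j] /= a[j][j]
                for c in range(j + 1, e):
                    a[i][c] -= a[i][j] * a[j][c]
        # 2. Block row to the right: U12 = L11^{-1} A12 (forward substitution).
        for j in range(k, e):
            for i in range(j + 1, e):
                lij = a[i][j]
                for c in range(e, n):
                    a[i][c] -= lij * a[j][c]
        # 3. Trailing submatrix update: A22 -= L21 * U12.
        for i in range(e, n):
            for j in range(k, e):
                lij = a[i][j]
                for c in range(e, n):
                    a[i][c] -= lij * a[j][c]
    return a


def make_matrix(n, seed=0):
    """Random matrix made diagonally dominant so that no pivoting is needed."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    for i in range(n):
        a[i][i] += n
    return a


def lu_product(lu):
    """Multiply the packed factors back together (used only to check the result)."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else lu[i][k]
                s += l_ik * lu[k][j]
            out[i][j] = s
    return out


def tune_block_size(n, candidates, repeats=3):
    """Hypothetical installation step: time each candidate and keep the fastest."""
    timings = {}
    for b in candidates:
        best = float("inf")
        for _ in range(repeats):
            a = make_matrix(n)
            start = time.perf_counter()
            block_lu(a, b)
            best = min(best, time.perf_counter() - start)
        timings[b] = best
    return min(timings, key=timings.get), timings


if __name__ == "__main__":
    chosen, timings = tune_block_size(n=60, candidates=[4, 8, 16, 32])
    print("chosen block size:", chosen)
```

In pure Python the timing differences between block sizes are mostly noise. The sketch only shows the shape of an installation routine; in a real library the three numbered steps would be calls into the underlying BLAS, where the block size has a measurable effect.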