J. Cuenca, Luis-Pedro García, D. Giménez, J. Dongarra
{"title":"Processes Distribution of Homogeneous Parallel Linear Algebra Routines on Heterogeneous Clusters","authors":"J. Cuenca, Luis-Pedro García, D. Giménez, J. Dongarra","doi":"10.1109/CLUSTR.2005.347021","DOIUrl":null,"url":null,"abstract":"This paper presents a self-optimization methodology for parallel linear algebra routines on heterogeneous systems. For each routine, a series of decisions is taken automatically in order to obtain an execution time close to the optimum (without rewriting the routine's code). Some of these decisions are: the number of processes to generate, the heterogeneous distribution of these processes over the network of processors, the logical topology of the generated processes,... To reduce the search space of such decisions, different heuristics have been used. The experiments have been performed with a parallel LU factorization routine similar to the ScaLAPACK one, and good results have been obtained on different heterogeneous platforms","PeriodicalId":255312,"journal":{"name":"2005 IEEE International Conference on Cluster Computing","volume":"87 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2005-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2005 IEEE International Conference on Cluster Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CLUSTR.2005.347021","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15
Abstract
This paper presents a self-optimization methodology for parallel linear algebra routines on heterogeneous systems. For each routine, a series of decisions is taken automatically in order to obtain an execution time close to the optimum (without rewriting the routine's code). Some of these decisions are: the number of processes to generate, the heterogeneous distribution of these processes over the network of processors, the logical topology of the generated processes,... To reduce the search space of such decisions, different heuristics have been used. The experiments have been performed with a parallel LU factorization routine similar to the ScaLAPACK one, and good results have been obtained on different heterogeneous platforms