On Guided Installation of Basic Linear Algebra Routines in Nodes with Manycore Components

Proceedings of the 7th International Workshop on Programming Models and Applications for Multicores and Manycores, Pub Date: 2016-03-12, DOI: 10.1145/2883404.2883422

Luis-Pedro García, J. Cuenca, Francisco-José Herrera, D. Giménez


Citations: 2

Abstract

Computational systems are now composed of basic computational components that share multiprocessors and coprocessors of different types, typically several GPUs or MICs. Software previously developed and optimized for simpler systems needs to be redesigned and re-optimized for these new, more complex systems. This work analyzes the adaptation of auto-tuning techniques for basic linear algebra routines to hybrid multicore+multiGPU and multicore+multiMIC systems. It studies the matrix-matrix multiplication kernel, which is optimized for the different computational components of a system through guided experimentation. The basic matrix-matrix multiplication is, in turn, used inside higher-level routines, which delegate their efficient execution to the optimization of the lower-level routine. Experimental results are satisfactory on different multicore+multiGPU and multicore+multiMIC systems. The guided search for execution configurations that give satisfactory execution times thus proves to be a useful tool for heterogeneous systems, whose complexity makes it difficult to use highly efficient routines and libraries correctly.
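The idea of a guided search at installation time can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the cost model `modeled_time` (one CPU and one coprocessor working concurrently on disjoint row blocks of the product, with made-up throughput and transfer figures), the step-halving local search in `guided_search`, and the helper names `install` and `select_share`. In a real installation the cost model would be replaced by timed runs of the actual multiplication kernel.

```python
from typing import Callable, Dict, Iterable, Tuple


def modeled_time(n: int, cpu_share: float,
                 cpu_gflops: float = 100.0, gpu_gflops: float = 400.0,
                 transfer_s: float = 0.01) -> float:
    """Hypothetical cost model (not from the paper) for an n x n product.

    The CPU computes a fraction cpu_share of the rows and the coprocessor
    computes the rest concurrently, paying a fixed transfer overhead.
    """
    flops = 2.0 * n ** 3
    cpu_t = cpu_share * flops / (cpu_gflops * 1e9)
    gpu_t = 0.0
    if cpu_share < 1.0:
        gpu_t = (1.0 - cpu_share) * flops / (gpu_gflops * 1e9) + transfer_s
    return max(cpu_t, gpu_t)  # both parts run at the same time


def guided_search(n: int, measure: Callable[[int, float], float],
                  step: float = 0.25,
                  min_step: float = 0.01) -> Tuple[float, float]:
    """Step-halving local search over the CPU share, starting at 0.5.

    Only a few configurations are measured instead of the whole range.
    """
    best_share = 0.5
    best_t = measure(n, best_share)
    while step >= min_step:
        improved = False
        # Both neighbours are computed from the share held at loop entry.
        for cand in (best_share - step, best_share + step):
            if 0.0 <= cand <= 1.0:
                t = measure(n, cand)
                if t < best_t:
                    best_share, best_t, improved = cand, t, True
        if not improved:
            step /= 2.0  # refine the search around the current best
    return best_share, best_t


def install(sizes: Iterable[int],
            measure: Callable[[int, float], float]) -> Dict[int, float]:
    """Installation phase: store the best CPU share found for each size."""
    return {n: guided_search(n, measure)[0] for n in sizes}


def select_share(table: Dict[int, float], n: int) -> float:
    """Run time: reuse the share stored for the nearest installed size."""
    nearest = min(table, key=lambda m: abs(m - n))
    return table[nearest]


table = install([1000, 4000, 8000], modeled_time)
print(table)
print(select_share(table, 4500))
```

A higher-level routine that calls the multiplication internally could consult `select_share` for each call rather than being tuned itself, which mirrors the delegation to the lower-level routine described in the abstract. Under this assumed model, small problems keep more work on the CPU because the fixed transfer overhead dominates.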

