Md. Salim, Ali O. Akkirman, Mert Hidayetoglu, L. Gurel
2015 Computational Electromagnetics International Workshop (CEM)
Publication date: 2015-07-01
DOI: 10.1109/CEM.2015.7237429 (https://doi.org/10.1109/CEM.2015.7237429)
Citations: 11
Comparative benchmarking: matrix multiplication on a multicore coprocessor and a GPU
Abstract
This paper reports the performance of an Intel Xeon Phi coprocessor and an Nvidia Tesla GPU in multiplying large matrices. To this end, several libraries, including Intel MKL and MAGMA, are used with different execution modes of the coprocessor. We compare the coprocessor and the GPU in terms of running time, memory requirement, and programming difficulty for the special case of matrix-matrix multiplication.
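As a rough illustration of the running-time and memory metrics named in the abstract, the following is a minimal sketch of how a matrix-matrix multiplication benchmark might be structured. It is not the authors' code: the paper uses Intel MKL and MAGMA on dedicated hardware, whereas this sketch uses a naive pure-Python product on the host CPU. The function names `matmul` and `benchmark`, the test matrices, and the choice of best-of-N timing are all assumptions made for illustration.

```python
import time


def matmul(a, b):
    """Naive dense product of two matrices stored as lists of rows."""
    b_cols = list(zip(*b))  # transpose once so each column is a tuple
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]


def benchmark(n, repeats=3):
    """Time an n x n product and derive simple performance figures.

    Hypothetical helper: reports the best wall-clock time over `repeats`
    runs, an approximate GFLOP/s rate, and the nominal footprint of three
    dense double-precision n x n matrices (A, B, and the result C).
    """
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        matmul(a, b)
        best = min(best, time.perf_counter() - start)

    flops = 2.0 * n ** 3           # about n^3 multiplies plus n^3 additions
    memory_bytes = 3 * n * n * 8   # 8 bytes per double-precision entry
    return {
        "seconds": best,
        "gflops": flops / best / 1e9,
        "memory_bytes": memory_bytes,
    }


if __name__ == "__main__":
    for size in (32, 64, 128):
        print(size, benchmark(size))
```

In a comparison like the one the paper describes, the call to `matmul` would presumably be replaced by a library routine running on the device under test, with the timing and operation-count logic left unchanged.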