{"title":"使用CUDA和OpenMP加速数据挖掘","authors":"Adwa S. Al-Hamoudi, A. Biyabani","doi":"10.1109/AICCSA.2014.7073244","DOIUrl":null,"url":null,"abstract":"The widespread availability of multi-core processors and specialized co-processors has generally not been matched by the actual use of parallel software by users. In this work we experimentally verify the simplicity of code parallelization by implementing two kinds of data mining algorithms on two parallel platforms with a view to building upon them in future projects. We use CUDA on a graphics card with 384 CUDA cores and OpenMP on a dual-core machine and record their performance versus the sequential base case with C++ code running on a single processor. We report modest speedups with OpenMP and significant speedups with CUDA as expected. We also observed underutilization of cores implying that results may be improved if the base code is further optimized.","PeriodicalId":412749,"journal":{"name":"2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Accelerating data mining with CUDA and OpenMP\",\"authors\":\"Adwa S. Al-Hamoudi, A. Biyabani\",\"doi\":\"10.1109/AICCSA.2014.7073244\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The widespread availability of multi-core processors and specialized co-processors has generally not been matched by the actual use of parallel software by users. In this work we experimentally verify the simplicity of code parallelization by implementing two kinds of data mining algorithms on two parallel platforms with a view to building upon them in future projects. We use CUDA on a graphics card with 384 CUDA cores and OpenMP on a dual-core machine and record their performance versus the sequential base case with C++ code running on a single processor. We report modest speedups with OpenMP and significant speedups with CUDA as expected. We also observed underutilization of cores implying that results may be improved if the base code is further optimized.\",\"PeriodicalId\":412749,\"journal\":{\"name\":\"2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA)\",\"volume\":\"32 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/AICCSA.2014.7073244\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AICCSA.2014.7073244","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract: The widespread availability of multi-core processors and specialized co-processors has generally not been matched by actual use of parallel software by end users. In this work we experimentally verify the simplicity of code parallelization by implementing two kinds of data mining algorithms on two parallel platforms, with a view to building upon them in future projects. We use CUDA on a graphics card with 384 CUDA cores and OpenMP on a dual-core machine, and compare their performance against a sequential C++ baseline running on a single processor. As expected, we report modest speedups with OpenMP and significant speedups with CUDA. We also observed underutilization of cores, implying that results may improve if the base code is further optimized.
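The abstract does not name the specific data mining algorithms or show code, so the following is only a minimal illustrative sketch, assuming a typical data-mining kernel (nearest-centroid assignment, as used in k-means). It shows the kind of loop-level parallelization with a single OpenMP pragma that the authors describe as simple to apply; the function and parameter names are hypothetical, not taken from the paper.

// Illustrative sketch (not the authors' code): parallelizing a generic
// data-mining kernel with OpenMP. Compile with e.g. g++ -O2 -fopenmp.
#include <cstddef>
#include <cfloat>
#include <vector>
#include <omp.h>

// Assign each point to its nearest centroid. Points and centroids are stored
// row-major as n x dim and k x dim float arrays; labels receives one index
// per point.
void assign_clusters(const std::vector<float>& points,
                     const std::vector<float>& centroids,
                     std::vector<int>& labels,
                     std::size_t n, std::size_t k, std::size_t dim) {
    // Each point's assignment is independent of the others, so the outer
    // loop can be distributed across cores with one pragma.
    #pragma omp parallel for
    for (long long i = 0; i < static_cast<long long>(n); ++i) {
        float best = FLT_MAX;
        int best_c = 0;
        for (std::size_t c = 0; c < k; ++c) {
            float dist = 0.0f;
            for (std::size_t d = 0; d < dim; ++d) {
                float diff = points[i * dim + d] - centroids[c * dim + d];
                dist += diff * diff;   // squared Euclidean distance
            }
            if (dist < best) {
                best = dist;
                best_c = static_cast<int>(c);
            }
        }
        labels[i] = best_c;
    }
}

On a dual-core machine such as the one reported in the paper, a loop like this would be limited to roughly a 2x speedup at best, which is consistent with the modest OpenMP gains the authors report; the same per-point independence is what makes a CUDA port (one thread per point) attractive on a many-core GPU.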