High performance GPU implementation of k-NN based on Mahalanobis distance

2015 International Symposium on Computer Science and Software Engineering (CSSE). Pub Date: 2015-08-01. DOI: 10.1109/CSICSSE.2015.7369240

Mohsen Gavahi, Reza Mirzaei, Abolfazl Nazarbeygi, Armin Ahmadzadeh, S. Gorgin

Citations: 3

Abstract

The k-nearest neighbor (k-NN) algorithm is a widely used classification technique with significant applications in many domains. Its main challenges are handling high-dimensional data, achieving reasonable accuracy, and keeping computation time acceptable. Parallel processing on many-core platforms such as GPUs is a popular approach to addressing these challenges. In this paper, we present a novel and accurate parallel implementation of k-NN based on the Mahalanobis distance metric on the GPU platform. We design and implement k-NN for the GPU architecture and use mathematical and algorithmic techniques to eliminate repetitive computations. Moreover, in addition to taking advantage of different parallelism techniques, we improve warp management to maximize the speedup of this implementation. On Compute Unified Device Architecture (CUDA)-enabled GPUs, the acceleration is considerable: experimental results show a 110x speedup over the single-core CPU implementation. Furthermore, we measure the energy and power consumption of the algorithm on both CPU and GPU platforms and find that the GPU is more energy efficient for this application.
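
The abstract does not describe the kernel design, so the following is only a CPU reference sketch in Python of the computation being accelerated, not the authors' CUDA implementation. It uses the standard squared Mahalanobis distance, (x - y)^T S^-1 (x - y), where S^-1 is the inverse covariance matrix. The names `mahalanobis_sq`, `knn_classify`, and `inv_cov` are illustrative assumptions, and the inverse covariance is assumed to be precomputed once rather than per query.

```python
from collections import Counter


def mahalanobis_sq(x, y, inv_cov):
    """Squared Mahalanobis distance (x - y)^T * inv_cov * (x - y)."""
    diff = [a - b for a, b in zip(x, y)]
    # Matrix-vector product inv_cov * diff, then a dot product with diff.
    tmp = [sum(r * d for r, d in zip(row, diff)) for row in inv_cov]
    return sum(d * t for d, t in zip(diff, tmp))


def knn_classify(query, points, labels, inv_cov, k):
    """Label `query` by majority vote among its k nearest training points."""
    # Squared distances preserve the ordering, so the square root is skipped.
    # Each distance is independent of the others, which is what makes this
    # loop a natural candidate for parallel execution (an assumption about
    # the general approach, not a description of the paper's kernels).
    scored = sorted(
        ((mahalanobis_sq(query, p, inv_cov), label)
         for p, label in zip(points, labels)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]
```

As a usage example, with `points = [[0.0, 8.0], [3.0, 0.0]]`, `labels = ["a", "b"]`, and a query at the origin, an identity `inv_cov` (plain Euclidean distance) selects "b" for k = 1, while `inv_cov = [[1.0, 0.0], [0.0, 0.0625]]` (variance 16 along the second axis) selects "a", because the large spread along that axis shrinks the effective distance to the first point.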

Source publication

2015 International Symposium on Computer Science and Software Engineering (CSSE)

Self-citation rate

0.00%

Publication count