{"title":"KNN算法的高性能GPU实现:综述","authors":"Pooja Bidye, Pradnya Borkar, Nitin Rakesh","doi":"10.1016/j.mex.2025.103633","DOIUrl":null,"url":null,"abstract":"<div><div>With large volumes of complex data generated by different applications, Machine Learning (ML) algorithms alone may not yield significant performance benefits on a single or multi-core CPU. Applying optimization techniques to these ML algorithms in a High-Performance Computing (HPC) environment can give considerable speedups for high-dimensional datasets. One of the most widely used classification algorithms, with applications in various domains, is the K-Nearest Neighbor (KNN). Despite its simplicity, KNN poses several challenges while handling high-dimensional data. However, the algorithm’s inherent nature presents an opportunity for parallelization. This paper reviews the optimization techniques employed by several researchers to accelerate the KNN algorithm on a GPU platform. The study reveals that techniques such as coalesced-memory access, tiling with shared memory, chunking, data segmentation, and pivot-based partitioning significantly contribute towards speeding up the KNN algorithm to leverage the GPU capabilities. The algorithms reviewed have performed exceptionally well on high-dimensional data with speedups up to 750x for a dual-GPU platform and up to 1840x for a multi-GPU platform. This study serves as a valuable resource for researchers examining KNN acceleration in high-performance computing environments and its applications in various fields.</div></div>","PeriodicalId":18446,"journal":{"name":"MethodsX","volume":"15 ","pages":"Article 103633"},"PeriodicalIF":1.9000,"publicationDate":"2025-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"High performance GPU implementation of KNN algorithm: A review\",\"authors\":\"Pooja Bidye, Pradnya Borkar, Nitin Rakesh\",\"doi\":\"10.1016/j.mex.2025.103633\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>With large volumes of complex data generated by different applications, Machine Learning (ML) algorithms alone may not yield significant performance benefits on a single or multi-core CPU. Applying optimization techniques to these ML algorithms in a High-Performance Computing (HPC) environment can give considerable speedups for high-dimensional datasets. One of the most widely used classification algorithms, with applications in various domains, is the K-Nearest Neighbor (KNN). Despite its simplicity, KNN poses several challenges while handling high-dimensional data. However, the algorithm’s inherent nature presents an opportunity for parallelization. This paper reviews the optimization techniques employed by several researchers to accelerate the KNN algorithm on a GPU platform. The study reveals that techniques such as coalesced-memory access, tiling with shared memory, chunking, data segmentation, and pivot-based partitioning significantly contribute towards speeding up the KNN algorithm to leverage the GPU capabilities. The algorithms reviewed have performed exceptionally well on high-dimensional data with speedups up to 750x for a dual-GPU platform and up to 1840x for a multi-GPU platform. This study serves as a valuable resource for researchers examining KNN acceleration in high-performance computing environments and its applications in various fields.</div></div>\",\"PeriodicalId\":18446,\"journal\":{\"name\":\"MethodsX\",\"volume\":\"15 \",\"pages\":\"Article 103633\"},\"PeriodicalIF\":1.9000,\"publicationDate\":\"2025-09-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"MethodsX\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2215016125004777\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"MULTIDISCIPLINARY SCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"MethodsX","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2215016125004777","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
High performance GPU implementation of KNN algorithm: A review
With large volumes of complex data generated by different applications, Machine Learning (ML) algorithms alone may not yield significant performance benefits on a single or multi-core CPU. Applying optimization techniques to these ML algorithms in a High-Performance Computing (HPC) environment can give considerable speedups for high-dimensional datasets. One of the most widely used classification algorithms, with applications in various domains, is the K-Nearest Neighbor (KNN). Despite its simplicity, KNN poses several challenges while handling high-dimensional data. However, the algorithm’s inherent nature presents an opportunity for parallelization. This paper reviews the optimization techniques employed by several researchers to accelerate the KNN algorithm on a GPU platform. The study reveals that techniques such as coalesced-memory access, tiling with shared memory, chunking, data segmentation, and pivot-based partitioning significantly contribute towards speeding up the KNN algorithm to leverage the GPU capabilities. The algorithms reviewed have performed exceptionally well on high-dimensional data with speedups up to 750x for a dual-GPU platform and up to 1840x for a multi-GPU platform. This study serves as a valuable resource for researchers examining KNN acceleration in high-performance computing environments and its applications in various fields.