GPU-Based Parallel Implementation of K-Means Clustering Algorithm for Image Segmentation
Authors: Shruti Karbhari, Shadi G. Alawneh
Published in: 2018 IEEE International Conference on Electro/Information Technology (EIT), 2018-05-03
DOI: 10.1109/EIT.2018.8500282 (https://doi.org/10.1109/EIT.2018.8500282)
Clustering algorithms group a dataset into clusters whose members share common features. Clustering has applications in computer vision, data mining, and market segmentation, among others. The k-means clustering algorithm is one of the most popular clustering algorithms; it uses the mean of each cluster as the cluster's prototype. In this paper, we explore accelerating k-means clustering using NVIDIA Graphics Processing Units (GPUs) programmed with CUDA C. Different optimization techniques are applied, such as the use of shared memory for image data and the use of constant memory for cluster data. Performance is evaluated on a range of images from small ($256\times 256$ pixels) to large ($1024\times 1024$ pixels), with the number of clusters ranging from 4 to 256. We find that, on average, the parallel implementation achieves a 9x speedup over the sequential version for 4 clusters. The speedup increases to 57x as the number of clusters increases to 256. This implementation also performs better than a reference implementation from Northwestern University/UC Berkeley.
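
The two memory optimizations named in the abstract can be illustrated with a minimal CUDA sketch of the k-means assignment step. This is not the authors' code. It assumes grayscale pixels stored as floats, one thread per pixel, and a block size of 256; the names `assign_labels`, `segment_pixels`, `d_centroids`, `MAX_CLUSTERS`, and `BLOCK_SIZE` are hypothetical. The abstract does not say how shared memory is used, so staging each block's pixels in a shared tile is only one plausible reading.

```cuda
#include <cuda_runtime.h>
#include <cfloat>

#define MAX_CLUSTERS 256  // largest cluster count mentioned in the abstract
#define BLOCK_SIZE   256  // assumed threads per block (not from the paper)

// Cluster data in constant memory: all threads read the same centroid in
// the same loop iteration, so the constant cache can broadcast the value.
__constant__ float d_centroids[MAX_CLUSTERS];

// Assignment step: each thread labels one grayscale pixel with the index
// of its nearest centroid (squared distance on intensity).
__global__ void assign_labels(const float* pixels, int n, int k, int* labels)
{
    __shared__ float tile[BLOCK_SIZE];  // image data staged in shared memory
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    if (i < n) tile[threadIdx.x] = pixels[i];
    __syncthreads();  // every thread in the block reaches this barrier

    if (i >= n) return;

    float p = tile[threadIdx.x];
    float best = FLT_MAX;
    int best_c = 0;
    for (int c = 0; c < k; ++c) {
        float d = p - d_centroids[c];
        d *= d;
        if (d < best) { best = d; best_c = c; }
    }
    labels[i] = best_c;
}

// Hypothetical host wrapper: copies data to the device, runs one
// assignment pass, and copies the labels back.
cudaError_t segment_pixels(const float* h_pixels, int n,
                           const float* h_centroids, int k, int* h_labels)
{
    if (n < 1 || k < 1 || k > MAX_CLUSTERS) return cudaErrorInvalidValue;

    float* d_pixels = nullptr;
    int* d_labels = nullptr;
    cudaError_t err = cudaMalloc(&d_pixels, n * sizeof(float));
    if (err != cudaSuccess) return err;
    err = cudaMalloc(&d_labels, n * sizeof(int));
    if (err != cudaSuccess) { cudaFree(d_pixels); return err; }

    err = cudaMemcpy(d_pixels, h_pixels, n * sizeof(float),
                     cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpyToSymbol(d_centroids, h_centroids, k * sizeof(float));
    if (err == cudaSuccess) {
        int blocks = (n + BLOCK_SIZE - 1) / BLOCK_SIZE;
        assign_labels<<<blocks, BLOCK_SIZE>>>(d_pixels, n, k, d_labels);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)  // this copy also waits for the kernel to finish
        err = cudaMemcpy(h_labels, d_labels, n * sizeof(int),
                         cudaMemcpyDeviceToHost);

    cudaFree(d_pixels);
    cudaFree(d_labels);
    return err;
}
```

A full k-means iteration would follow this pass with an update step that recomputes each centroid as the mean of its assigned pixels and then repeats until the labels stop changing; that part is omitted here. The inner loop runs once per cluster, so the arithmetic per pixel grows with the cluster count while the memory traffic per pixel stays fixed.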