Title: Improved K-Means Algorithm on Home Industry Data Clustering in the Province of Bangka Belitung
Authors: Hadi Santoso, Hilyah Magdalena
Venue: 2020 International Conference on Smart Technology and Applications (ICoSTA)
Publication date: 2020-02-01
DOI: 10.1109/ICoSTA48221.2020.1570598913
Citations: 3
Abstract
The Government of Bangka Belitung Islands Province has not yet classified its home industries. To address this, we propose a k-means algorithm for clustering home industry data. The k-means algorithm is widely used because it is straightforward and well suited to grouping data. However, it has a weakness: the initial cluster centers are chosen randomly, so a poor random initialization of the centroids leads to suboptimal grouping. Internal cluster validation is one way to determine the optimal number of clusters without prior information about the data. This study aims to identify the optimal grouping by improving the k-means algorithm and then evaluating it with two internal cluster validation indices, the Davies-Bouldin Index (DBI) and the Silhouette Index (SI), on home industry data from Bangka Belitung Islands Province. With the improved k-means algorithm, both the Silhouette Index and the DBI indicate an optimal clustering at k = 3, whereas with the traditional k-means algorithm both indices indicate k = 2. We conclude that k = 3, as indicated by the Davies-Bouldin Index, gives good results for clustering home industry data in Bangka Belitung Islands Province.
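The workflow described in the abstract (non-random centroid initialization, then choosing k by internal validation) can be illustrated with a minimal sketch. The abstract does not say how the improved algorithm picks its starting centroids, so the farthest-point seeding below is a stand-in assumption, not the authors' method. The data is synthetic (three artificial blobs), not the home industry data, and all function names are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def farthest_point_seeds(data, k):
    """Deterministic seeding (an assumption, not the paper's method):
    start at the point nearest the overall mean, then repeatedly add
    the point farthest from every seed chosen so far."""
    center = mean(data)
    seeds = [min(data, key=lambda p: dist(p, center))]
    while len(seeds) < k:
        seeds.append(max(data, key=lambda p: min(dist(p, s) for s in seeds)))
    return seeds


def assign(data, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in data]


def kmeans(data, centroids, max_iter=100):
    """Standard Lloyd iterations starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(max_iter):
        labels = assign(data, centroids)
        new = []
        for j in range(len(centroids)):
            members = [p for p, l in zip(data, labels) if l == j]
            new.append(mean(members) if members else centroids[j])
        if new == centroids:
            break
        centroids = new
    return assign(data, centroids), centroids


def davies_bouldin(data, labels, centroids):
    """Davies-Bouldin Index: lower values mean compact, well-separated clusters."""
    k = len(centroids)
    scatter = []
    for j in range(k):
        members = [p for p, l in zip(data, labels) if l == j]
        s = sum(dist(p, centroids[j]) for p in members) / len(members) if members else 0.0
        scatter.append(s)
    worst = [max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                 for j in range(k) if j != i) for i in range(k)]
    return sum(worst) / k


def silhouette(data, labels):
    """Mean silhouette width: higher is better. Singleton clusters score 0."""
    k = max(labels) + 1
    clusters = [[p for p, l in zip(data, labels) if l == j] for j in range(k)]
    scores = []
    for p, l in zip(data, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)  # mean intra-cluster distance
        b = min(sum(dist(p, q) for q in c) / len(c)        # nearest other cluster
                for j, c in enumerate(clusters) if j != l and c)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Synthetic toy data: three well-separated 2-D blobs (not the paper's data).
rng = random.Random(0)
blobs = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
data = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)) for cx, cy in blobs for _ in range(20)]

results = {}
for k in range(2, 6):
    labels, centroids = kmeans(data, farthest_point_seeds(data, k))
    results[k] = (davies_bouldin(data, labels, centroids), silhouette(data, labels))
    print(k, round(results[k][0], 3), round(results[k][1], 3))

best_k_dbi = min(results, key=lambda k: results[k][0])  # lowest DBI
best_k_sil = max(results, key=lambda k: results[k][1])  # highest silhouette
```

The two indices are read in opposite directions: the candidate k with the lowest DBI and the one with the highest silhouette value are preferred. Because the seeding is deterministic, repeated runs on the same data give the same clustering, which is the property that random initialization lacks.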