Determining The Optimal Number of K-Means Clusters Using The Calinski Harabasz Index and Krzanowski and Lai Index Methods for Grouping Flood Prone Areas In North Sumatra
Ziana Syahputri, S. Sutarman, Machrani Adi Putri Siregar
Sinkron, journal article, published 2024-01-13. DOI: https://doi.org/10.33395/sinkron.v9i1.13246
Abstract
The k-means algorithm is a partitional clustering method. Its advantages include ease of implementation, good convergence behavior, and dense clusters; its main drawback is that the optimal number of clusters is difficult to determine. In this research, k-means is applied to group the flood-prone areas of North Sumatra, using eleven variables. The aim is to find the optimal number of clusters with the Calinski-Harabasz (CH) index and the Krzanowski and Lai (KL) index, and to compare the two choices by their Cluster Tightness Measure (CTM) values. Because the CTM of the CH solution (0.376, with k = 6) is smaller than the CTM of the KL solution (0.7843, with k = 2), the research concludes that CH with k = 6 determines the optimal number of clusters better than KL with k = 2.
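The selection procedure described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (the flood dataset is not reproduced here), the function name `score_cluster_counts` is hypothetical, and the KL index uses the commonly cited form KL(k) = |DIFF(k) / DIFF(k+1)| with DIFF(k) = (k-1)^(2/p) W_(k-1) - k^(2/p) W_k, where W_k is the within-cluster sum of squares and p the number of variables. The CTM comparison is omitted because the abstract does not define it.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score
from sklearn.preprocessing import StandardScaler


def score_cluster_counts(X, k_values, seed=0):
    """Return ({k: CH index}, {k: KL index}) for candidate k values (each k >= 2).

    Hypothetical helper for illustration only.
    """
    p = X.shape[1]  # number of variables
    # The KL index at k needs W_(k-1), W_k and W_(k+1), so fit one extra
    # model on each side of the candidate range.
    wss, labels = {}, {}
    for k in range(min(k_values) - 1, max(k_values) + 2):
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        wss[k] = model.inertia_  # within-cluster sum of squares W_k
        labels[k] = model.labels_

    def diff(k):
        # DIFF(k) = (k-1)^(2/p) * W_(k-1) - k^(2/p) * W_k
        return (k - 1) ** (2 / p) * wss[k - 1] - k ** (2 / p) * wss[k]

    ch = {k: calinski_harabasz_score(X, labels[k]) for k in k_values}
    kl = {k: abs(diff(k) / diff(k + 1)) for k in k_values}
    return ch, kl


# Synthetic stand-in data with eleven variables (an assumption; not the flood data).
X_raw, _ = make_blobs(n_samples=300, n_features=11, centers=4, random_state=0)
X = StandardScaler().fit_transform(X_raw)  # put all variables on one scale

candidates = list(range(2, 9))
ch_scores, kl_scores = score_cluster_counts(X, candidates)

# Both indices are maximised: the k with the largest value is the one suggested.
best_k_ch = max(ch_scores, key=ch_scores.get)
best_k_kl = max(kl_scores, key=kl_scores.get)
print("k suggested by CH:", best_k_ch)
print("k suggested by KL:", best_k_kl)
```

The two indices can disagree, as they do in the abstract (k = 6 versus k = 2), which is why a separate tightness measure is used there to choose between the two suggestions.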