Comparison of Distance Metrics on Fuzzy C-Means Algorithm Through Customer Segmentation
Uus Rusdiana, Iin Ernawati, Noor Falih, A. Arista
2021 International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS), 2021-10-28
DOI: 10.1109/ICIMCIS53775.2021.9699206
Distance metrics are widely used in similarity-based algorithms such as clustering, where they determine how data points are grouped, and they play a crucial role in building machine learning models. This study therefore examines which distance metric performs best in a clustering algorithm. The algorithm used is Fuzzy C-Means (FCM) clustering, run with four distance measures: Euclidean, Manhattan, Chebyshev, and Minkowski. The resulting clusters are evaluated with the partition coefficient index (PC), the modified partition coefficient index (MPC), and RMSE. For the dataset with 2 clusters, the Manhattan distance gave the best results; for the dataset with 3 clusters, the Minkowski distance did. Across the experiments conducted on the dataset, the Manhattan and Minkowski distances gave the best results for the FCM algorithm.
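The comparison described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. Several points are assumptions made for illustration only: the fuzzifier m = 2, the Minkowski order p = 3 (the abstract does not state the order used), the weighted-mean center update (exact for the Euclidean distance and used here as a common simplification for the other metrics), and the toy two-dimensional points, which are made up and are not customer data. RMSE is left out because the abstract does not say what it is computed against.

```python
import random


def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def euclidean(x, y):
    return minkowski(x, y, 2)  # Minkowski with p = 2


def manhattan(x, y):
    return minkowski(x, y, 1)  # Minkowski with p = 1


def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))  # limit of Minkowski as p grows


def fuzzy_c_means(data, c, dist, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Fuzzy C-Means with a pluggable distance; returns (centers, U).

    U[i][k] is the membership of point i in cluster k, taken from the
    last membership update. Initial centers are c randomly sampled points.
    """
    rng = random.Random(seed)
    centers = [list(x) for x in rng.sample(data, c)]
    n, dim = len(data), len(data[0])
    power = 2.0 / (m - 1.0)
    U = []
    for _ in range(max_iter):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1))
        U = []
        for x in data:
            d = [max(dist(x, v), 1e-12) for v in centers]  # avoid division by zero
            U.append([1.0 / sum((d[k] / d[j]) ** power for j in range(c))
                      for k in range(c)])
        # Center update: mean of the data weighted by u_ik^m
        # (assumption: kept for every metric, exact only for Euclidean).
        new_centers = []
        for k in range(c):
            w = [U[i][k] ** m for i in range(n)]
            total = sum(w)
            new_centers.append([sum(w[i] * data[i][t] for i in range(n)) / total
                                for t in range(dim)])
        shift = max(euclidean(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, U


def partition_coefficient(U):
    """PC: mean over points of the sum of squared memberships, in [1/c, 1]."""
    return sum(u * u for row in U for u in row) / len(U)


def modified_partition_coefficient(U):
    """MPC: PC rescaled to [0, 1] to remove its dependence on c."""
    c = len(U[0])
    return 1.0 - c / (c - 1.0) * (1.0 - partition_coefficient(U))


if __name__ == "__main__":
    # Hypothetical toy points, for illustration only.
    toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    metrics = {
        "euclidean": euclidean,
        "manhattan": manhattan,
        "chebyshev": chebyshev,
        "minkowski(p=3)": lambda x, y: minkowski(x, y, 3),
    }
    for name, dist in metrics.items():
        _, U = fuzzy_c_means(toy, 2, dist)
        print(name, partition_coefficient(U), modified_partition_coefficient(U))
```

Higher PC and MPC values indicate a crisper partition, so under these assumptions the metric with the highest index for a given number of clusters would be the preferred one. Passing the distance as a function keeps the FCM loop unchanged across the four metrics, which is what makes a like-for-like comparison possible.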