Initial clustering center optimization and feature auto-weighting for k-Means clustering algorithm

2022 International Conference on Machine Learning and Intelligent Systems Engineering (MLISE) | Pub Date: 2022-08-01 | DOI: 10.1109/mlise57402.2022.00036

Fu-zhou Zhao


Citations: 1

Abstract

We focus on two issues in k-means clustering. First, clustering quality depends strongly on the choice of initial cluster centers. Traditional algorithms tend to select multiple initial centers from the same cluster; to avoid this, we use the maximum distance principle, which ensures that the initial centers belong to different categories. Second, k-means treats all features equally during clustering, so it cannot assign greater weights to the more important features in high-dimensional data. We improve the algorithm with a multidimensional feature-weighting technique that gives more weight to the more important features, yielding an algorithm that is more efficient and accurate than traditional k-means. In our experiments, the enhancements improved efficiency by 33% and accuracy by 36%.
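The abstract does not give the exact procedure, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. `max_distance_init` uses farthest-first seeding as one common reading of the "maximum distance principle"; the choice of the first center (the point farthest from the data mean) is an assumption. `weighted_kmeans` uses an inverse-dispersion weight update as a hypothetical stand-in for the paper's feature-weighting technique. Both function names and all parameters are illustrative and do not come from the paper.

```python
import numpy as np


def max_distance_init(X, k):
    """Farthest-first seeding: each new center is the point that is
    farthest from its nearest already-chosen center."""
    X = np.asarray(X, dtype=float)
    # Assumption: the first center is the point farthest from the data mean.
    first = np.argmax(np.linalg.norm(X - X.mean(axis=0), axis=1))
    centers = [X[first]]
    # Distance from every point to its nearest chosen center so far.
    nearest = np.linalg.norm(X - centers[0], axis=1)
    for _ in range(1, k):
        idx = np.argmax(nearest)
        centers.append(X[idx])
        nearest = np.minimum(nearest, np.linalg.norm(X - X[idx], axis=1))
    return np.array(centers)


def weighted_kmeans(X, k, n_iter=20):
    """Lloyd-style iterations with per-feature weights in the distance."""
    X = np.asarray(X, dtype=float)
    centers = max_distance_init(X, k)
    weights = np.full(X.shape[1], 1.0 / X.shape[1])  # start uniform
    for _ in range(n_iter):
        # Weighted squared Euclidean distance, shape (n_samples, k).
        diff = X[:, None, :] - centers[None, :, :]
        dist = (weights * diff ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of its members; an empty
        # cluster keeps its previous center.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
        # Hypothetical weight rule (not from the paper): features with
        # low within-cluster dispersion get proportionally more weight.
        dispersion = ((X - centers[labels]) ** 2).sum(axis=0)
        inv = 1.0 / (dispersion + 1e-12)
        weights = inv / inv.sum()
    return labels, centers, weights
```

Calling `labels, centers, weights = weighted_kmeans(X, k=3)` on an array of shape `(n_samples, n_features)` returns the cluster assignments, the final centers, and feature weights that sum to 1. Farthest-first seeding is deterministic but sensitive to outliers, so any real use would need to be checked against the full paper.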

