Initial clustering center optimization and feature auto-weighting for k-Means clustering algorithm
Fu-zhou Zhao
2022 International Conference on Machine Learning and Intelligent Systems Engineering (MLISE), 2022-08-01
DOI: 10.1109/mlise57402.2022.00036 (https://doi.org/10.1109/mlise57402.2022.00036)
Citations: 1
Abstract
We focus on two main issues. First, the effectiveness of clustering depends strongly on the choice of initial clustering centers. Traditional algorithms tend to select multiple initial clustering centers within the same cluster; to avoid this problem, we use the maximum distance principle, which ensures that the initial clustering centers belong to different categories. Second, the k-means algorithm cannot assign greater weights to essential features in high-dimensional data because it treats all features equally during clustering. We improve the algorithm with a multidimensional feature-weighting technique that gives more weight to the more essential features, and obtain an algorithm that is more efficient and more accurate than traditional k-means. In our experiments, the enhancements improved efficiency by 33% and accuracy by 36%.
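The two ideas in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the seeding details or the weight-update rule, so the sketch assumes a farthest-first reading of the maximum distance principle (starting from the point farthest from the data mean) and takes the feature weights as fixed inputs from the caller. The names `weighted_sq_dist`, `max_distance_init`, and `weighted_kmeans` are hypothetical.

```python
def weighted_sq_dist(a, b, weights):
    """Squared Euclidean distance with one weight per feature."""
    return sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights))


def max_distance_init(points, k, weights):
    """Choose k initial centers that lie far apart (farthest-first rule)."""
    dim = len(points[0])
    mean = [sum(p[j] for p in points) / len(points) for j in range(dim)]
    # Assumption: the first center is the point farthest from the data mean.
    centers = [max(points, key=lambda p: weighted_sq_dist(p, mean, weights))]
    while len(centers) < k:
        # Next center: the point whose nearest chosen center is farthest away.
        centers.append(
            max(
                points,
                key=lambda p: min(weighted_sq_dist(p, c, weights) for c in centers),
            )
        )
    return centers


def weighted_kmeans(points, k, weights, max_iter=100):
    """Lloyd-style k-means using weighted distances and max-distance seeding."""
    centers = [list(c) for c in max_distance_init(points, k, weights)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center under the weighted distance.
        new_labels = [
            min(range(k), key=lambda i: weighted_sq_dist(p, centers[i], weights))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old center if a cluster is empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Hypothetical data: feature 0 separates three groups, feature 1 is noise.
points = [
    (0.0, 5.0), (0.2, -4.0), (0.1, 0.0),
    (5.0, 4.0), (5.1, -5.0), (4.9, 1.0),
    (10.0, -3.0), (10.2, 3.0), (9.9, 0.5),
]
centers, labels = weighted_kmeans(points, k=3, weights=[1.0, 0.01])
print(labels)
```

With the noisy second feature down-weighted, the seeding step should place one initial center in each group, and the points are then clustered by their first feature. An auto-weighting method such as the one the abstract describes would learn `weights` from the data instead of receiving them as an argument.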