COMPARATIVE STUDY OF MUTATION OPERATORS IN THE GENETIC ALGORITHMS FOR THE K-MEANS PROBLEM

Impact factor: 0.5 | JCR quartile: Q3 (Mathematics)
Ri-Zhi Li, L. Kazakovtsev
Journal: Facta Universitatis-Series Mathematics and Informatics
DOI: 10.22190/FUMI2004091L (https://doi.org/10.22190/FUMI2004091L)
Publication date: 2021-02-02
Publication type: Journal Article
Citations: 0

Abstract

The k-means problem and the algorithm of the same name are the most commonly used clustering model and algorithm. Being a local search method, the k-means algorithm converges to a local minimum of the objective function (the sum of squared errors), and its result depends on the initial solution, which is either given or selected randomly. This disadvantage can be avoided by combining the algorithm with more sophisticated methods such as Variable Neighborhood Search, agglomerative or dissociative heuristic approaches, genetic algorithms, etc. To address the shortcomings of the k-means algorithm while combining its advantages with those of the evolutionary approach, a genetic clustering algorithm with a cross-mutation operator was designed. The efficiency of genetic algorithms with tournament selection, one-point crossover, and various mutation operators (no mutation operator, uniform mutation, DBM mutation, and the new cross-mutation) is compared on data sets of up to 2 million data vectors. We used data from the UCI repository and a special data set collected during the testing of highly reliable semiconductor components. In this paper, we discuss neither the efficiency of genetic algorithms for the k-means problem relative to other (non-genetic) algorithms nor the comparative adequacy of the k-means clustering model. Here, we focus only on the influence of various mutation operators on the efficiency of the genetic algorithms.
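The following is a minimal sketch of the general scheme the abstract names: a genetic algorithm over sets of k centers, with tournament selection, one-point crossover, a mutation step, and k-means as the local search, minimizing the sum of squared errors. It is not the authors' implementation. All function names and parameters (`pop_size`, `generations`, `rate`) are hypothetical, and "uniform mutation" is interpreted here as replacing a center with a uniformly chosen data point, which is an assumption. The DBM and cross-mutation operators are not specified in the abstract and are therefore not shown.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(points, centers):
    """Objective: sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def kmeans(points, centers, iters=10):
    """k-means (Lloyd) local search starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[j].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def tournament(pop, fitness, rng, size=2):
    """Tournament selection: best (lowest SSE) of `size` random individuals."""
    contenders = rng.sample(range(len(pop)), size)
    return pop[min(contenders, key=lambda i: fitness[i])]

def one_point_crossover(a, b, rng):
    """Child takes the first `cut` centers from parent a and the rest from b."""
    cut = rng.randrange(1, len(a)) if len(a) > 1 else 0
    return a[:cut] + b[cut:]

def uniform_mutation(centers, points, rng, rate=0.1):
    """ASSUMED operator: each center is replaced by a random data point
    with probability `rate`."""
    return [rng.choice(points) if rng.random() < rate else c for c in centers]

def genetic_kmeans(points, k, pop_size=10, generations=20, seed=0):
    """Steady-state GA: one child per generation replaces the worst
    individual if the child has a lower SSE."""
    rng = random.Random(seed)
    pop = [kmeans(points, rng.sample(points, k)) for _ in range(pop_size)]
    fitness = [sse(points, ind) for ind in pop]
    for _ in range(generations):
        a = tournament(pop, fitness, rng)
        b = tournament(pop, fitness, rng)
        child = uniform_mutation(one_point_crossover(a, b, rng), points, rng)
        child = kmeans(points, child)  # local search refines the child
        child_fit = sse(points, child)
        worst = max(range(pop_size), key=lambda i: fitness[i])
        if child_fit < fitness[worst]:
            pop[worst], fitness[worst] = child, child_fit
    best = min(range(pop_size), key=lambda i: fitness[i])
    return pop[best], fitness[best]
```

Calling `genetic_kmeans(points, k)` with `points` as a list of equal-length tuples returns the best set of centers found and its SSE. Setting `rate=0.0` in `uniform_mutation` corresponds to the "without any mutation operator" variant, so swapping the mutation function is the only change needed to compare operators under this sketch.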