Enhancing Classification Algorithms with Metaheuristic Technique

DOI: 10.61453/jods.v2024no22 (https://doi.org/10.61453/jods.v2024no22)
Journal: Journal of data science, volume 89 S1
Publication date: 2024-07-01
Publication type: Journal Article
Citations: 0

Abstract

Classification is the process of assigning data to appropriate categories or classes based on specific attributes or features, so that the labels or classes of new data can be predicted from patterns observed in previously seen training data. This process is implemented with classification algorithms such as Naïve Bayes, Support Vector Machine, and Random Forest. However, a classification algorithm often cannot classify data optimally because of the challenges of dealing with varied data sets. Not all available features make a solid contribution to the class label; some act as noise or interference. For this reason, a feature selection process is necessary. Currently, many feature selection processes rely on correlation values from chi-square and information gain, but the accuracy of the results is often still not good enough. This is because the chi-square and information-gain values are fixed, so the selection of features is minimal and is not based on a previous learning process, or what is known as heuristics. For this reason, this research introduces auxiliary algorithms, namely meta-heuristic algorithms, to improve the performance of the classification algorithm. Meta-heuristic algorithms are search techniques used to solve complex optimization problems, and they can help provide reasonable solutions in a shorter time than exact methods. In operation, the meta-heuristic algorithm optimizes the feature selection process, and the selected features are then processed by the classification algorithm. Three (3) meta-heuristics were implemented, namely Genetic Algorithm, Particle Swarm Optimization, and Cuckoo Search Algorithm; experiments were conducted, and the results were collected and analyzed. The results show that combining Naïve Bayes with the Genetic Algorithm gives the best performance, with an accuracy improvement of +23.77%.
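The abstract describes a wrapper approach: a meta-heuristic searches over feature subsets, and each candidate subset is scored by the accuracy of a classifier trained on it. The following is a minimal sketch of that idea only, pairing a simple genetic algorithm with a hand-written Gaussian Naïve Bayes. It is not the authors' implementation. The synthetic data, the function names (`make_data`, `nb_accuracy`, `ga_select`), and all parameter values (population size, generations, mutation rate) are assumptions chosen for illustration.

```python
import math
import random

def make_data(n=200, n_informative=2, n_noise=6, seed=0):
    """Synthetic two-class data (assumed): informative features plus pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y

def nb_accuracy(X_train, y_train, X_test, y_test, mask):
    """Fit Gaussian Naive Bayes on the features kept by `mask`; return test accuracy."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    stats = {}
    for c in set(y_train):
        rows = [x for x, label in zip(X_train, y_train) if label == c]
        params = []
        for j in cols:
            vals = [r[j] for r in rows]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals) + 1e-9
            params.append((mean, var))
        stats[c] = (len(rows) / len(X_train), params)
    correct = 0
    for x, label in zip(X_test, y_test):
        def log_posterior(c):
            prior, params = stats[c]
            lp = math.log(prior)
            for j, (mean, var) in zip(cols, params):
                lp += -0.5 * math.log(2 * math.pi * var) - (x[j] - mean) ** 2 / (2 * var)
            return lp
        correct += max(stats, key=log_posterior) == label
    return correct / len(y_test)

def ga_select(X_train, y_train, X_val, y_val, pop_size=20, generations=15,
              mutation_rate=0.1, seed=0):
    """Genetic algorithm over binary feature masks; fitness is validation accuracy."""
    rng = random.Random(seed)
    n = len(X_train[0])
    fitness = lambda m: nb_accuracy(X_train, y_train, X_val, y_val, m)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    pop[0] = [1] * n  # seed with "all features" so the result never scores below it
    best = max(pop, key=fitness)
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0][:]]              # elitism: keep the current best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)        # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        best = max(pop + [best], key=fitness)
    return best, fitness(best)

X, y = make_data()
X_tr, y_tr, X_va, y_va = X[:140], y[:140], X[140:], y[140:]
baseline = nb_accuracy(X_tr, y_tr, X_va, y_va, [1] * len(X[0]))
mask, score = ga_select(X_tr, y_tr, X_va, y_va)
print("all features:", baseline, "GA-selected:", score, "mask:", mask)
```

Because the mask is chosen to maximize accuracy on the validation split, the reported score is optimistic; a fair comparison would evaluate the selected mask on a separate held-out test set. Particle Swarm Optimization or Cuckoo Search could be swapped in for `ga_select` while keeping the same fitness function.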