Binary Harris Hawks Optimisation Filter Based Approach for Feature Selection

Ruba Abu Khurma, M. Awadallah, Ibrahim Aljarah
Published in: 2021 Palestinian International Conference on Information and Communication Technology (PICICT)
Publication date: 2021-09-01
DOI: 10.1109/PICICT53635.2021.00022 (https://doi.org/10.1109/PICICT53635.2021.00022)
Citations: 2

Abstract

Feature selection (FS) reduces the dimensionality of a dataset by eliminating irrelevant and redundant features, with the aim of improving the performance of data mining tasks. Meta-heuristic algorithms are promising search strategies for traversing the feature space to find a (near-)optimal feature subset. Harris hawks optimization (HHO) is a recently developed meta-heuristic algorithm inspired by the hunting strategy of Harris hawks in nature. The main contribution of this paper is two new filter-based methods for applying FS to classification problems. Both methods integrate information theory with the HHO algorithm. The first method combines HHO with the mutual information between any two features. The second method combines HHO with the entropy of each group of features. The adopted fitness function improves performance with respect to both the number of selected features and the classification accuracy, and it assigns different weights to relevance and redundancy. Experimental results show that, with proper weights, the two proposed methods can significantly reduce the number of selected features and achieve higher classification accuracy on most of the datasets. The first method usually selects a smaller feature subset, while the second method can achieve higher classification accuracy.
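The abstract does not state the exact fitness function, so the following is only a minimal sketch of how a mutual-information filter score with separate relevance and redundancy weights might look. The names `entropy`, `mutual_information`, `mi_fitness`, and the weight `alpha` are hypothetical, the features are assumed to be discrete, and an exhaustive search over a three-feature toy problem stands in for the HHO search, which is not implemented here.

```python
import math
from collections import Counter
from itertools import product
from typing import Sequence


def entropy(values: Sequence) -> float:
    """Shannon entropy (in bits) of a discrete sequence."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(x: Sequence, y: Sequence) -> float:
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def mi_fitness(mask, features, labels, alpha=0.8):
    """Assumed filter fitness (higher is better) for a binary feature mask.

    relevance  = mean MI between each selected feature and the labels
    redundancy = mean MI over all pairs of selected features
    alpha      = assumed weight trading relevance against redundancy
    """
    selected = [f for bit, f in zip(mask, features) if bit]
    if not selected:
        return float("-inf")  # an empty subset is never acceptable
    relevance = sum(mutual_information(f, labels) for f in selected) / len(selected)
    pairs = [(a, b) for i, a in enumerate(selected) for b in selected[i + 1:]]
    redundancy = (
        sum(mutual_information(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    )
    return alpha * relevance - (1 - alpha) * redundancy


if __name__ == "__main__":
    # Toy data: one informative feature, an exact copy of it, and pure noise.
    labels = [0, 0, 1, 1, 0, 0, 1, 1]
    features = [list(labels), list(labels), [0, 1, 0, 1, 0, 1, 0, 1]]
    # Exhaustive search stands in for the HHO search on this tiny problem.
    best = max(product([0, 1], repeat=len(features)),
               key=lambda m: mi_fitness(m, features, labels))
    print(best, mi_fitness(best, features, labels))
```

On this toy data, selecting the informative feature together with its copy scores lower than selecting either one alone, because the redundancy term penalizes the duplicated information. The second method in the abstract would swap the pairwise mutual information for an entropy computed over each group of features; that variant is not sketched here.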