Binary Harris Hawks Optimisation Filter Based Approach for Feature Selection
Authors: Ruba Abu Khurma, M. Awadallah, Ibrahim Aljarah
Published in: 2021 Palestinian International Conference on Information and Communication Technology (PICICT), 2021-09-01
DOI: 10.1109/PICICT53635.2021.00022 (https://doi.org/10.1109/PICICT53635.2021.00022)
Feature selection (FS) is a technique for reducing the dimensionality of datasets by eliminating irrelevant and redundant features, thereby improving the performance of data mining tasks. Meta-heuristic algorithms are promising search methods for traversing the feature space to find a (near-)optimal feature subset. The Harris hawks optimization (HHO) algorithm is a recently developed meta-heuristic inspired by the hunting strategy of Harris hawks in nature. The main contribution of this paper is two new filter-based methods for applying FS to classification problems. Both methods integrate information theory with the HHO algorithm. The first method applies HHO with the mutual information between any two features. The second method applies HHO with the entropy of each group of features. The adopted fitness function improves performance in terms of both the number of selected features and the classification accuracy, and it assigns different weights to relevance and redundancy. The experimental results show that, with proper weights, the two proposed methods can significantly reduce the number of selected features and achieve higher classification accuracy on most of the datasets. The first method usually selects a smaller feature subset, while the second method can achieve higher classification accuracy.
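To make the idea of weighting relevance against redundancy concrete, the following is a minimal sketch of a mutual-information filter fitness for a binary feature mask, such as the position a binary hawk might encode. It is not the authors' fitness function, which the abstract does not specify. The function names, the weight `alpha`, the averaging scheme, and the toy data are all assumptions made for illustration, and features are assumed to be discrete.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(*columns):
    """Shannon entropy in bits of the joint distribution of discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * math.log2(c / n) for c in Counter(rows).values())


def mutual_information(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for two discrete columns."""
    return entropy(x) + entropy(y) - entropy(x, y)


def filter_fitness(mask, features, labels, alpha=0.8):
    """Hypothetical fitness (higher is better), NOT the paper's formula.

    relevance  = mean mutual information between selected features and labels
    redundancy = mean pairwise mutual information among selected features
    alpha      = assumed weight on relevance; (1 - alpha) weights redundancy
    """
    selected = [f for f, bit in zip(features, mask) if bit]
    if not selected:
        return float("-inf")  # an empty subset is treated as invalid
    relevance = sum(mutual_information(f, labels) for f in selected) / len(selected)
    pairs = list(combinations(selected, 2))
    redundancy = (
        sum(mutual_information(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    )
    return alpha * relevance - (1 - alpha) * redundancy


# Toy data (assumed): f0 predicts the label, f1 duplicates f0, f2 is unrelated.
labels = [0, 0, 1, 1]
features = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]

for mask in ([1, 0, 0], [1, 1, 0], [1, 0, 1]):
    print(mask, filter_fitness(mask, features, labels))
```

On this toy data, the mask that keeps only the informative feature scores highest, adding its duplicate is penalised through the redundancy term, and adding the unrelated feature lowers the average relevance. Because `entropy` accepts several columns at once, calling `entropy(*selected)` gives the joint entropy of a feature group, which is the kind of quantity an entropy-based variant could use instead of pairwise mutual information.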