{"title":"Ensemble feature selection using weighted concatenated voting for text classification","authors":"O. Ige, K. H. Gan","doi":"10.46481/jnsps.2024.1823","DOIUrl":null,"url":null,"abstract":"Following the increasing number of high dimensional data, selecting relevant features has always been better handled by filter feature selection techniques due to its improved generalization, faster training time, dimensionality reduction, less prone to overfitting, and improved model performance. However, the most used feature selection methods are unstable; a feature selection method chooses different subsets of characteristics that produce different classification accuracy. Selecting an appropriate hybrid harnesses the local feature relevant to the discriminative power of filter methods for improved text classification, which is lacking in past literature. In this paper, we proposed a novel multi-univariate hybrid feature selection method (MUNIFES) for enhanced discriminative power between the features and the target class. The proposed method utilizes multi-iterative processes to select the best feature sets from each univariate feature selection method. MUNIFES has employed the ensemble of multi-filter discriminative strength of Chi-Square (Chi2), Analysis of Variance (ANOVA), and Infogain methods to select optimal feature subsets. To evaluate the success of the proposed method, several experiments were performed on the 20newsgroup dataset and its variant (17newsgroup) with 10 classifiers (including ensemble, classification and optimization algorithms, and Artificial Neural Network (ANN)), compared with the state-of-the-art feature selection methods. The MUNIFES results indicated a better accuracy classification performance.","PeriodicalId":342917,"journal":{"name":"Journal of the Nigerian Society of Physical Sciences","volume":"5 8","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the Nigerian Society of Physical Sciences","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.46481/jnsps.2024.1823","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Following the increasing number of high dimensional data, selecting relevant features has always been better handled by filter feature selection techniques due to its improved generalization, faster training time, dimensionality reduction, less prone to overfitting, and improved model performance. However, the most used feature selection methods are unstable; a feature selection method chooses different subsets of characteristics that produce different classification accuracy. Selecting an appropriate hybrid harnesses the local feature relevant to the discriminative power of filter methods for improved text classification, which is lacking in past literature. In this paper, we proposed a novel multi-univariate hybrid feature selection method (MUNIFES) for enhanced discriminative power between the features and the target class. The proposed method utilizes multi-iterative processes to select the best feature sets from each univariate feature selection method. MUNIFES has employed the ensemble of multi-filter discriminative strength of Chi-Square (Chi2), Analysis of Variance (ANOVA), and Infogain methods to select optimal feature subsets. To evaluate the success of the proposed method, several experiments were performed on the 20newsgroup dataset and its variant (17newsgroup) with 10 classifiers (including ensemble, classification and optimization algorithms, and Artificial Neural Network (ANN)), compared with the state-of-the-art feature selection methods. The MUNIFES results indicated a better accuracy classification performance.