{"title":"DCB-VIM:一种基于集成学习的类分布不平衡特征选择滤波方法","authors":"Nayiri Galestian Pour , Soudabeh Shemehsavar","doi":"10.1016/j.neucom.2025.130848","DOIUrl":null,"url":null,"abstract":"<div><div>Feature selection aims to improve predictive performance and interpretability in the analysis of datasets with high dimensional feature spaces. Imbalanced class distribution can make the process of feature selection more severe. Robust methodologies are essential for dealing with this case. Therefore, we present a filter method based on ensemble learning, in which each classifier is built on randomly selected subspaces of features. Variable importance measure is computed based on a class-wise procedure within each classifier, and a feature weighting procedure is subsequently applied. The performance of classifiers is considered in the combination phase of the ensemble learning. Different choices of hyperparameters consisting of the subspace size and the number of classification trees are investigated through simulation studies for determining their effects on the predictive performance. The efficiency of the proposed method is evaluated with respect to predictive performance by different selection strategies based on real data analysis in the presence of class imbalance.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"651 ","pages":"Article 130848"},"PeriodicalIF":5.5000,"publicationDate":"2025-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DCB-VIM: An ensemble learning based filter method for feature selection with imbalanced class distribution\",\"authors\":\"Nayiri Galestian Pour , Soudabeh Shemehsavar\",\"doi\":\"10.1016/j.neucom.2025.130848\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Feature selection aims to improve predictive performance and interpretability in the analysis of datasets with high dimensional feature spaces. Imbalanced class distribution can make the process of feature selection more severe. Robust methodologies are essential for dealing with this case. Therefore, we present a filter method based on ensemble learning, in which each classifier is built on randomly selected subspaces of features. Variable importance measure is computed based on a class-wise procedure within each classifier, and a feature weighting procedure is subsequently applied. The performance of classifiers is considered in the combination phase of the ensemble learning. Different choices of hyperparameters consisting of the subspace size and the number of classification trees are investigated through simulation studies for determining their effects on the predictive performance. 
The efficiency of the proposed method is evaluated with respect to predictive performance by different selection strategies based on real data analysis in the presence of class imbalance.</div></div>\",\"PeriodicalId\":19268,\"journal\":{\"name\":\"Neurocomputing\",\"volume\":\"651 \",\"pages\":\"Article 130848\"},\"PeriodicalIF\":5.5000,\"publicationDate\":\"2025-07-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurocomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0925231225015206\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231225015206","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
DCB-VIM: An ensemble learning based filter method for feature selection with imbalanced class distribution
Feature selection aims to improve predictive performance and interpretability in the analysis of datasets with high-dimensional feature spaces. An imbalanced class distribution can make feature selection considerably more challenging, so robust methodologies are essential for this setting. We therefore present a filter method based on ensemble learning, in which each classifier is built on a randomly selected subspace of features. A variable importance measure is computed through a class-wise procedure within each classifier, and a feature weighting procedure is subsequently applied. The performance of the individual classifiers is taken into account in the combination phase of the ensemble. Different choices of the hyperparameters, namely the subspace size and the number of classification trees, are investigated through simulation studies to determine their effect on predictive performance. The efficiency of the proposed method is evaluated in terms of predictive performance under different selection strategies in real data analyses with class imbalance.
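
As a rough illustration of the kind of procedure described in the abstract, the sketch below builds an ensemble of decision trees on randomly selected feature subspaces, forms a class-wise importance within each tree, and weights each tree's contribution by its predictive performance. This is a hypothetical reconstruction, not the authors' DCB-VIM implementation: the function name subspace_ensemble_importance, the use of one-vs-rest Gini importances as the class-wise measure, balanced accuracy as the performance weight, and the 70/30 holdout split are all assumptions made for this example.

# Illustrative sketch only; not the published DCB-VIM algorithm.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

def subspace_ensemble_importance(X, y, n_trees=100, subspace_size=10, seed=0):
    """Score features with a performance-weighted, class-wise importance
    computed over an ensemble of random-subspace decision trees."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    scores = np.zeros(n_features)
    weights = np.zeros(n_features)
    classes = np.unique(y)
    for _ in range(n_trees):
        # Randomly selected feature subspace for this classifier.
        subset = rng.choice(n_features, size=min(subspace_size, n_features), replace=False)
        X_tr, X_te, y_tr, y_te = train_test_split(
            X[:, subset], y, test_size=0.3, stratify=y,
            random_state=int(rng.integers(1 << 31)))
        # Class-wise importance, approximated here by averaging
        # Gini importances of one-vs-rest trees (an assumption).
        class_imp = np.zeros(len(subset))
        for c in classes:
            tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr == c)
            class_imp += tree.feature_importances_
        class_imp /= len(classes)
        # Weight the tree's contribution by its held-out performance;
        # balanced accuracy is assumed as the performance measure.
        full_tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
        perf = balanced_accuracy_score(y_te, full_tree.predict(X_te))
        scores[subset] += perf * class_imp
        weights[subset] += perf
    # Normalize so features drawn into more subspaces are not favored.
    return np.divide(scores, weights, out=np.zeros_like(scores), where=weights > 0)

Under these assumptions, features would then be ranked by the returned scores (for example with np.argsort) and the top-ranked subset retained for the downstream classifier.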
Journal introduction:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. The essential topics covered are neurocomputing theory, practice, and applications.