{"title":"FeatureX:用于深度学习的可解释特征选择","authors":"Siyi Liang , Yang Zhang , Kun Zheng, Yu Bai","doi":"10.1016/j.eswa.2025.127675","DOIUrl":null,"url":null,"abstract":"<div><div>Feature selection is critical for the performance of deep learning models by reducing the dimensionality of feature sets to understand the features’ importance. Existing techniques focus on the statistical characteristics of different features, which makes them hard to understand due to complicated mathematical reasoning. Furthermore, feature selection can be impacted by model preferences, resulting in a lack of explainability. To this end, this paper proposes an effective method called FeatureX to obtain the optimal feature subset and enhance the explainability of the feature selection process through quantitative evaluation. Firstly, FeatureX proposes importance analysis to quantify the contribution of each feature to the deep learning model by leveraging feature perturbation. Secondly, to mitigate the multicollinearity, FeatureX employs statistical analysis to calculate the correlation coefficients of these features and removes redundant features based on the magnitude of the correlation coefficients. Finally, with the feature contribution and correlation coefficients, FeatureX screens these features automatically to identify the most relevant and high-contribution features. Based on existing research and prior knowledge of the data, FeatureX presets the values of relevant parameters and demonstrates their effectiveness through parameter sensitivity analysis. FeatureX is evaluated on 17 public datasets with 5 fundamental deep learning models. Experimental results show that FeatureX can reduce the number of features by an average of 47.83% and the accuracy of 63.33% deep learning models are improved. Furthermore, when comparing against the existing feature selection techniques, FeatureX improves the F-measure by an average of 1.61%, demonstrating its effectiveness.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"282 ","pages":"Article 127675"},"PeriodicalIF":7.5000,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"FeatureX: An explainable feature selection for deep learning\",\"authors\":\"Siyi Liang , Yang Zhang , Kun Zheng, Yu Bai\",\"doi\":\"10.1016/j.eswa.2025.127675\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Feature selection is critical for the performance of deep learning models by reducing the dimensionality of feature sets to understand the features’ importance. Existing techniques focus on the statistical characteristics of different features, which makes them hard to understand due to complicated mathematical reasoning. Furthermore, feature selection can be impacted by model preferences, resulting in a lack of explainability. To this end, this paper proposes an effective method called FeatureX to obtain the optimal feature subset and enhance the explainability of the feature selection process through quantitative evaluation. Firstly, FeatureX proposes importance analysis to quantify the contribution of each feature to the deep learning model by leveraging feature perturbation. Secondly, to mitigate the multicollinearity, FeatureX employs statistical analysis to calculate the correlation coefficients of these features and removes redundant features based on the magnitude of the correlation coefficients. 
Finally, with the feature contribution and correlation coefficients, FeatureX screens these features automatically to identify the most relevant and high-contribution features. Based on existing research and prior knowledge of the data, FeatureX presets the values of relevant parameters and demonstrates their effectiveness through parameter sensitivity analysis. FeatureX is evaluated on 17 public datasets with 5 fundamental deep learning models. Experimental results show that FeatureX can reduce the number of features by an average of 47.83% and the accuracy of 63.33% deep learning models are improved. Furthermore, when comparing against the existing feature selection techniques, FeatureX improves the F-measure by an average of 1.61%, demonstrating its effectiveness.</div></div>\",\"PeriodicalId\":50461,\"journal\":{\"name\":\"Expert Systems with Applications\",\"volume\":\"282 \",\"pages\":\"Article 127675\"},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2025-04-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Expert Systems with Applications\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0957417425012977\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Expert Systems with Applications","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0957417425012977","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
FeatureX: An explainable feature selection for deep learning
Feature selection is critical to the performance of deep learning models: by reducing the dimensionality of the feature set, it reveals which features matter. Existing techniques focus on the statistical characteristics of individual features, and the complicated mathematical reasoning behind them makes their results hard to interpret. Furthermore, feature selection can be biased by model preferences, resulting in a lack of explainability. To this end, this paper proposes FeatureX, an effective method that obtains an optimal feature subset and improves the explainability of the feature selection process through quantitative evaluation. First, FeatureX performs importance analysis, using feature perturbation to quantify each feature's contribution to the deep learning model. Second, to mitigate multicollinearity, FeatureX applies statistical analysis to compute correlation coefficients between features and removes redundant features according to the magnitude of these coefficients. Finally, combining feature contributions and correlation coefficients, FeatureX screens the features automatically to identify the most relevant, high-contribution ones. Based on existing research and prior knowledge of the data, FeatureX presets the values of the relevant parameters and demonstrates their effectiveness through parameter sensitivity analysis. FeatureX is evaluated on 17 public datasets with 5 fundamental deep learning models. Experimental results show that FeatureX reduces the number of features by 47.83% on average and improves the accuracy of 63.33% of the deep learning models. Furthermore, compared with existing feature selection techniques, FeatureX improves the F-measure by an average of 1.61%, demonstrating its effectiveness.
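The abstract outlines a three-step pipeline: perturbation-based importance analysis, correlation-based redundancy removal, and automatic screening. The sketch below illustrates that general idea only; it is not the authors' FeatureX implementation, and the function names, the 0.9 correlation threshold, the zero importance floor, and the permutation-style perturbation are illustrative assumptions.

```python
# Illustrative sketch of a perturbation-importance + correlation-filter pipeline.
# NOTE: this is NOT the authors' FeatureX implementation; the thresholds and all
# function names here are assumptions made for illustration only.
import numpy as np

def perturbation_importance(model, X, y, metric, n_repeats=5, seed=None):
    """Estimate each feature's contribution by shuffling it and measuring the
    drop in the model's score (a common permutation-style perturbation)."""
    rng = np.random.default_rng(seed)
    baseline = metric(y, model.predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])   # perturb one feature
            drops.append(baseline - metric(y, model.predict(Xp)))
        importances[j] = np.mean(drops)
    return importances

def correlation_filter(X, importances, corr_threshold=0.9):
    """Drop the lower-contribution feature from each highly correlated pair,
    mitigating multicollinearity while keeping high-contribution features."""
    corr = np.corrcoef(X, rowvar=False)
    keep = np.ones(X.shape[1], dtype=bool)
    for i in range(X.shape[1]):
        for j in range(i + 1, X.shape[1]):
            if keep[i] and keep[j] and abs(corr[i, j]) > corr_threshold:
                drop = i if importances[i] < importances[j] else j
                keep[drop] = False
    return np.flatnonzero(keep)

def select_features(model, X, y, metric, corr_threshold=0.9, min_importance=0.0):
    """Combine contribution and correlation to screen features automatically."""
    imp = perturbation_importance(model, X, y, metric)
    kept = correlation_filter(X, imp, corr_threshold)
    return [j for j in kept if imp[j] > min_importance]
```

In this sketch, `model` is any estimator exposing `predict`, and `metric` is a score where higher is better (e.g., accuracy); both threshold values are placeholders rather than parameters taken from the paper.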
Journal introduction:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.