Novel Feature Selection Algorithms Based on Crowding Distance and Pearson Correlation Coefficient

JCR: Q3 (Computer Science)
Abdesslem Layeb
DOI: 10.5815/ijisa.2023.02.04 (https://doi.org/10.5815/ijisa.2023.02.04)
Publication date: 2023-04-08
Publication type: Journal Article
Citations: 0

Abstract

Feature selection is an important phase in building classification models: it reduces dimensionality and eliminates redundant and irrelevant features. This paper proposes three novel feature selection algorithms: a filter method, a wrapper method, and a hybrid filter method. All three use the crowding distance, a measure borrowed from multiobjective optimization, as a new metric for assessing feature importance. The underlying idea is that less crowded features have a strong impact on the target attribute (the class), whereas crowded features tend to have roughly the same impact on it. Because the crowding distance can be strengthened by combining it with other metrics, the hybrid method combines it with the Pearson correlation coefficient to rank features by importance. Experiments on well-known benchmark datasets, including large microarray datasets, demonstrate the effectiveness and robustness of the proposed algorithms.
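The abstract does not give the algorithms themselves, so the following is only a minimal sketch of how a crowding-distance score and a Pearson score might be combined into a feature ranking. Several things here are assumptions rather than the paper's method: each feature is treated as a point whose coordinates are its min-max-scaled values over the samples; boundary points receive twice their one-sided gap instead of the infinite distance used in NSGA-II, so that scores stay finite; and the two scores are combined by a simple product. The function names `crowding_distance`, `abs_pearson`, and `hybrid_scores` are hypothetical.

```python
import numpy as np


def crowding_distance(points):
    """NSGA-II style crowding distance for the rows of `points`.

    Assumption: boundary points get twice their one-sided gap rather
    than infinity, so that every score is finite.
    """
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    dist = np.zeros(n)
    if n < 2:
        return dist
    for j in range(d):
        order = np.argsort(points[:, j], kind="stable")
        col = points[order, j]
        span = col[-1] - col[0]
        if span == 0:
            continue  # all points coincide on this coordinate
        gaps = np.empty(n)
        gaps[1:-1] = col[2:] - col[:-2]      # distance between both neighbours
        gaps[0] = 2 * (col[1] - col[0])      # finite stand-in for the boundary
        gaps[-1] = 2 * (col[-1] - col[-2])
        dist[order] += gaps / span
    return dist


def abs_pearson(X, y):
    """Absolute Pearson correlation between each column of X and y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns have zero variance; give them a correlation of 0.
    r = np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    return np.abs(r)


def hybrid_scores(X, y):
    """Hypothetical hybrid score: normalised crowding distance times |r|."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)
    Xs = (X - lo) / scale                 # min-max scale each feature
    cd = crowding_distance(Xs.T)          # one point per feature
    cd_max = cd.max()
    cd_norm = cd / cd_max if cd_max > 0 else cd
    return cd_norm * abs_pearson(X, y)


# Toy data: feature 0 is shifted by the class label, the rest are noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 5))
X[:, 0] += 3 * y

scores = hybrid_scores(X, y)
ranking = np.argsort(-scores)  # feature indices, highest score first
print(ranking)
```

A product is used only because it keeps a feature's score low unless both criteria agree; a weighted sum or a rank aggregation would be equally plausible readings of "combines", and the abstract does not say which the paper uses.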
Source journal: International Journal of Intelligent Systems and Applications in Engineering (Computer Science: Computer Graphics and Computer-Aided Design)
CiteScore: 1.30
Self-citation rate: 0.00%
Articles published: 18