A Multi-Objective Hybrid Filter-Wrapper Evolutionary Approach for Feature Construction on High-Dimensional Data

Marwa Hammami, Slim Bechikh, C. Hung, L. B. Said
2018 IEEE Congress on Evolutionary Computation (CEC), published 2018-07-01
DOI: 10.1109/CEC.2018.8477771 (https://doi.org/10.1109/CEC.2018.8477771)
Citation count: 6

Abstract

Feature selection and feature construction are important pre-processing techniques in data mining. They can reduce dimensionality and also improve classifier accuracy and efficiency, which matters most for high-dimensional data. Feature construction on such data remains challenging because the search space of feature combinations grows with the number of features. Genetic Programming (GP) has recently been applied to feature construction with promising results. However, wrapper evaluation of each feature subset, in which a feature may itself be constructed from a combination of features, is computationally intensive because it requires running the classifier on the data sets. Motivated by this observation, we propose a hybrid multi-objective evolutionary approach for efficient feature construction and selection. The approach uses two filter objectives and one wrapper objective corresponding to classification accuracy. The whole population is evaluated with the two filter objectives, and only the non-dominated (best) feature subsets are then improved by an indicator-based local search that optimizes all three objectives simultaneously. The approach was assessed on six high-dimensional datasets and compared with two prominent existing GP approaches, using three different classifiers for accuracy evaluation. The results show that it provides competitive or better results than the two competing GP algorithms tested in this study.
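
The evaluation scheme described in the abstract (cheap filter objectives for every individual, the expensive wrapper objective only for non-dominated ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names neither the two filter measures nor the indicator used by the local search, so the sketch omits the local search entirely, assumes all objectives are minimized, and uses hypothetical names (`dominates`, `non_dominated`, `hybrid_evaluate`, `wrapper_error`) chosen only for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """Pareto dominance under minimization: a is no worse than b on every
    objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(scores: Sequence[Scores]) -> List[int]:
    """Return the indices of score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def hybrid_evaluate(
    population: Sequence[object],
    filter_objectives: Sequence[Callable[[object], float]],
    wrapper_error: Callable[[object], float],
) -> Dict[int, Scores]:
    """Score everyone with the cheap filter objectives, then compute the
    expensive wrapper objective only for the non-dominated individuals.

    Assumption: `wrapper_error` stands in for 1 - classifier accuracy.
    """
    filter_scores = [tuple(f(ind) for f in filter_objectives) for ind in population]
    front = non_dominated(filter_scores)
    return {i: filter_scores[i] + (wrapper_error(population[i]),) for i in front}


if __name__ == "__main__":
    # Toy data: each individual is just an index into made-up filter scores.
    toy_scores = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0), (4.0, 4.0)]
    calls = []

    def toy_wrapper(ind: int) -> float:
        calls.append(ind)  # track how often the costly evaluation runs
        return 0.1 * ind

    result = hybrid_evaluate(
        range(len(toy_scores)),
        [lambda i: toy_scores[i][0], lambda i: toy_scores[i][1]],
        toy_wrapper,
    )
    print(sorted(result), "wrapper calls:", len(calls))
```

In this toy run the wrapper function is called only for the individuals on the filter-objective Pareto front, which is the source of the computational saving the abstract describes; a full method would then apply the indicator-based local search to those individuals using all three objectives.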