A New Filter Evaluation Function for Feature Subset Selection with Evolutionary Computation

Atsushi Kawamura, B. Chakraborty
2018 9th International Conference on Awareness Science and Technology (iCAST), 2018-09-01
DOI: 10.1109/ICAWST.2018.8517241 (https://doi.org/10.1109/ICAWST.2018.8517241)
Citations: 1

Abstract

Feature subset selection is an optimization problem in pattern classification and data mining: the goal is to achieve high classification accuracy with a small number of features and low computational cost. There are various approaches to this problem. Typically, a search algorithm is combined with a fitness function to find the optimum feature subset. The fitness function is based either on intrinsic characteristics of the data (the filter type) or on the classification accuracy of the classifier used (the wrapper type). Both approaches have their respective merits and demerits. Although many algorithms have been developed so far, none of them works equally well for all data sets, especially for very high-dimensional data sets. In this work, a new feature evaluation measure, based on a concept borrowed from topic modelling in text processing, has been developed. The proposed measure is used as the fitness function of evolutionary computational search techniques to design a filter-type feature subset selection approach. Simulation experiments with various benchmark data sets have been carried out to assess the efficiency of the proposed approach in comparison with the popular conventional filter-type feature selection algorithms mRMR and CFS. The proposed approach is found to be better in that it selects fewer features with comparable classification accuracy. The proposed algorithms work better for higher-dimensional feature sets and may prove to be an effective solution for feature selection in very high-dimensional data.
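
The general scheme described in the abstract, an evolutionary search driven by a filter-type fitness function, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the abstract does not define the topic-modelling-based measure, so a CFS-style merit computed from absolute Pearson correlations is swapped in as a stand-in fitness. The function names (`make_cfs_fitness`, `ga_select`), the GA settings, and the synthetic data are all hypothetical.

```python
import numpy as np


def make_cfs_fitness(X, y):
    """Build a filter fitness function over boolean feature masks.

    Stand-in measure (NOT the paper's): a CFS-style merit that rewards
    feature-class correlation and penalises feature-feature correlation.
    """
    corr = np.abs(np.corrcoef(np.column_stack([X, y]), rowvar=False))
    r_fc = corr[:-1, -1]   # |correlation| of each feature with the class
    r_ff = corr[:-1, :-1]  # |correlation| between pairs of features

    def fitness(mask):
        idx = np.flatnonzero(mask)
        k = idx.size
        if k == 0:
            return 0.0  # an empty subset carries no information
        mean_fc = r_fc[idx].mean()
        if k == 1:
            return float(mean_fc)
        sub = r_ff[np.ix_(idx, idx)]
        mean_ff = (sub.sum() - np.trace(sub)) / (k * (k - 1))  # off-diagonal mean
        return float(k * mean_fc / np.sqrt(k + k * (k - 1) * mean_ff))

    return fitness


def ga_select(fitness, n_features, pop_size=30, generations=40, seed=0):
    """Simple genetic algorithm over bit masks; returns (best_mask, history)."""
    rng = np.random.default_rng(seed)
    p_mut = 1.0 / n_features                       # assumed mutation rate
    pop = rng.random((pop_size, n_features)) < 0.5
    history = []
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        elite = pop[scores.argmax()].copy()
        history.append(float(scores.max()))
        # Binary tournament selection.
        a, b = rng.integers(pop_size, size=(2, pop_size))
        parents = np.where((scores[a] >= scores[b])[:, None], pop[a], pop[b])
        # Uniform crossover with a randomly permuted partner.
        partners = parents[rng.permutation(pop_size)]
        children = np.where(rng.random(parents.shape) < 0.5, parents, partners)
        # Bit-flip mutation, then elitism: keep the best mask unchanged.
        children ^= rng.random(children.shape) < p_mut
        children[0] = elite
        pop = children
    scores = np.array([fitness(ind) for ind in pop])
    history.append(float(scores.max()))
    return pop[scores.argmax()].copy(), history


# Synthetic demo data: only the first 3 of 30 features carry class information.
rng = np.random.default_rng(42)
n_samples, n_features = 200, 30
y = rng.integers(0, 2, n_samples)
X = rng.normal(size=(n_samples, n_features))
X[:, :3] += y[:, None] * 1.5

fitness = make_cfs_fitness(X, y)
best_mask, history = ga_select(fitness, n_features)
print("selected features:", np.flatnonzero(best_mask))
print("merit of best subset:", round(history[-1], 3))
```

Because the fitness depends only on correlations within the data, no classifier is trained during the search, which is what distinguishes a filter-type approach from a wrapper. Replacing `make_cfs_fitness` with a wrapper fitness (for example, cross-validated accuracy of a classifier on the masked features) would leave `ga_select` unchanged.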