A differential evolution approach to dimensionality reduction for classification needs

International Journal of Applied Mathematics and Computer Sciences Pub Date : 2014-03-01 DOI:10.2478/amcs-2014-0009

G. Martinović, Drazen Bajer, Bruno Zoric


Citations: 28

Abstract

The feature selection problem arises frequently in pattern recognition and, more specifically, in classification. Although patterns may contain a large number of features, some of them can prove irrelevant, redundant, or even detrimental to classification accuracy. Removing such features reduces the dimensionality of the problem and may improve classification accuracy. This paper presents an approach to dimensionality reduction based on differential evolution, which acts as a wrapper and explores the solution space. Candidate solutions, which are subsets of the full feature set, are evaluated using the k-nearest neighbour algorithm. High-quality solutions found during the run of differential evolution are stored in an archive. The final solution is obtained by conducting k-fold cross-validation on the archived solutions and selecting the best one. Experimental analysis is conducted on several standard test sets. The classification accuracy of the k-nearest neighbour algorithm using the full feature set is compared with the accuracy of the same algorithm using only the subset provided by the proposed approach, and with subsets provided by several other optimization algorithms used as wrappers. The analysis shows that the proposed approach successfully determines good feature subsets, which may increase classification accuracy.
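The pipeline described in the abstract (differential evolution as a wrapper, k-nearest neighbour scoring, an archive of good subsets, and a final k-fold cross-validation over the archive) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic details, so the real-valued encoding with a 0.5 threshold, the DE/rand/1/bin variant, the hold-out split used for fitness, the archive policy, and every parameter value and function name below are assumptions chosen only for illustration.

```python
import random

def knn_predict(train, query, mask, k):
    """Majority vote among the k nearest training rows, using only masked features."""
    idx = [j for j, keep in enumerate(mask) if keep]
    dists = sorted(
        (sum((x[j] - query[j]) ** 2 for j in idx), y) for x, y in train
    )
    votes = [y for _, y in dists[:k]]
    return max(set(votes), key=votes.count)

def accuracy(train, test, mask, k=3):
    if not any(mask):          # an empty subset cannot classify anything
        return 0.0
    hits = sum(knn_predict(train, x, mask, k) == y for x, y in test)
    return hits / len(test)

def cross_val_accuracy(data, mask, folds=5, k=3):
    """Plain k-fold cross-validation of k-NN restricted to `mask`."""
    scores = []
    for f in range(folds):
        test = data[f::folds]
        train = [row for i, row in enumerate(data) if i % folds != f]
        scores.append(accuracy(train, test, mask, k))
    return sum(scores) / folds

def decode(vector):
    # Assumed encoding: a gene above 0.5 means "keep this feature".
    return tuple(g > 0.5 for g in vector)

def de_feature_selection(data, n_features, pop_size=12, generations=15,
                         F=0.5, CR=0.9, archive_size=5, seed=0):
    """Hypothetical DE wrapper; all defaults are illustrative assumptions."""
    rng = random.Random(seed)
    split = len(data) * 2 // 3
    train, valid = data[:split], data[split:]   # assumed hold-out for fitness

    def fitness(vector):
        return accuracy(train, valid, decode(vector))

    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    fit = [fitness(v) for v in pop]
    archive = {}                                # mask -> best fitness seen

    def remember(mask, score):
        archive[mask] = max(score, archive.get(mask, 0.0))
        while len(archive) > archive_size:      # drop the weakest entry
            del archive[min(archive, key=archive.get)]

    for v, s in zip(pop, fit):
        remember(decode(v), s)

    for _ in range(generations):
        for i in range(pop_size):
            # DE/rand/1 mutation with binomial crossover, clipped to [0, 1].
            a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            j_rand = rng.randrange(n_features)
            trial = [
                min(1.0, max(0.0, a[j] + F * (b[j] - c[j])))
                if rng.random() < CR or j == j_rand else pop[i][j]
                for j in range(n_features)
            ]
            score = fitness(trial)
            if score >= fit[i]:                 # greedy one-to-one selection
                pop[i], fit[i] = trial, score
                remember(decode(trial), score)

    # Final step: re-score archived subsets with k-fold CV and keep the best.
    return max(archive, key=lambda m: cross_val_accuracy(data, m))
```

Here `data` is assumed to be a list of `(feature_list, label)` pairs in mixed class order, and the function returns a tuple of booleans marking the selected features. Re-scoring the archive with cross-validation, rather than trusting the single hold-out fitness, is meant to mirror the final selection step the abstract describes.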



