Feature selection based on bootstrapping

2005 ICSC Congress on Computational Intelligence Methods and Applications. Pub Date: 2005-12-15. DOI: 10.1109/CIMA.2005.1662338

N. Díaz-Díaz, J. Aguilar-Ruiz, Juan A. Nepomuceno, Jorge García

Citations: 6

Abstract

The results of feature selection methods strongly influence the success of data mining processes, especially when the data sets are high-dimensional. Finding the optimal result requires evaluating the classification precision of every possible subset of features, i.e., an exhaustive search through the search space. However, this is infeasible because of its computational complexity. In this paper, we propose a novel feature selection method based on bootstrapping techniques. Our approach shows that it is not necessary to try every subset of features: evaluating only a very small number of combinations achieves the same performance as the exhaustive approach. The experiments were carried out on very high-dimensional data sets (thousands of features), and they show that precision can be maintained while complexity is reduced substantially.
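The abstract does not describe the procedure itself, so the following is only a minimal sketch of one generic way bootstrapping can be used for feature selection, not the authors' method. It contrasts with exhaustive search, which would have to evaluate all 2^n - 1 non-empty subsets of n features. Everything here is an assumption made for illustration: the function names (`feature_scores`, `bootstrap_select`, `make_toy_data`), the scoring rule (absolute difference of class means), the parameters `n_rounds`, `top_k` and `min_freq`, and the synthetic data.

```python
import random
from collections import Counter


def feature_scores(rows, labels):
    """Score each feature by the absolute difference between class means.

    Assumed scoring rule for illustration only; the paper's criterion
    is not given in the abstract.
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        if not pos or not neg:  # a resample may miss one class entirely
            scores.append(0.0)
            continue
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    return scores


def bootstrap_select(rows, labels, n_rounds=50, top_k=3, min_freq=0.6, seed=0):
    """Keep features that rank in the top_k in at least min_freq of the rounds."""
    rng = random.Random(seed)
    n = len(rows)
    counts = Counter()
    for _ in range(n_rounds):
        # Bootstrap resample: draw n row indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        scores = feature_scores(sample_rows, sample_labels)
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        counts.update(ranked[:top_k])
    return sorted(j for j, c in counts.items() if c / n_rounds >= min_freq)


def make_toy_data(n_samples=200, n_features=20, informative=(0, 1, 2), seed=1):
    """Synthetic two-class data: only the `informative` columns depend on the label."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n_samples):
        y = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 3.0 * y  # shift informative features for class 1
        rows.append(row)
        labels.append(y)
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_toy_data()
    print(bootstrap_select(rows, labels))
```

In this sketch the cost is `n_rounds` scoring passes over the features rather than one evaluation per subset, which is the kind of reduction the abstract alludes to. On the toy data, the features that were shifted for one class should be the ones that survive the frequency threshold.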
