The Relationship between Generalization Error and the Training Sample Number of SVM

2009 Fifth International Conference on Natural Computation. Pub Date: 2009-08-14. DOI: 10.1109/ICNC.2009.479

Junqing Bai, Guirong Yan, Wentao Mao

Citations: 0

Abstract

Constructing the training set and determining the number of samples are important steps in a regression problem. This paper elaborates a new approach to constructing the training set, the key point of which is to choose the hyper-parameters before determining the training set. More importantly, a heuristic approach is proposed for selecting training samples for a support vector machine (SVM). Using these methods, the relationship between the generalization error and the number of training samples is computed at a given confidence level. Empirical results on benchmark data (Boston Housing) and engineering data indicate that the proposed approach can serve as a reference for constructing a proper training set. Moreover, the proposed approach has practical significance for other parametric learning machines.
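The abstract does not spell out the heuristic sample-selection procedure, so the following is only a minimal sketch of the general idea of relating held-out error to the number of training samples at a chosen confidence level. It assumes the hyper-parameters are already fixed inside the learner, draws random training subsets (not the paper's heuristic selection), and reports the empirical quantile of the held-out mean squared error. The names `error_vs_sample_size` and `fit_line` are hypothetical, the data are synthetic, and the stand-in learner is a one-feature least-squares line, not an SVM.

```python
import math
import random
import statistics


def error_vs_sample_size(fit, X, y, sizes, n_repeats=30, confidence=0.9, seed=0):
    """For each training size n, return (n, mean_error, error_bound).

    error_bound is the empirical `confidence` quantile of the held-out
    mean squared error over `n_repeats` random training subsets.
    `fit(X_train, y_train)` must return a predict(x) callable whose
    hyper-parameters were fixed beforehand (an assumption of this sketch).
    """
    rng = random.Random(seed)
    indices = list(range(len(X)))
    results = []
    for n in sizes:
        if not 0 < n < len(X):
            raise ValueError("each size must leave at least one held-out point")
        errors = []
        for _ in range(n_repeats):
            rng.shuffle(indices)
            train, test = indices[:n], indices[n:]
            predict = fit([X[i] for i in train], [y[i] for i in train])
            errors.append(statistics.fmean((predict(X[i]) - y[i]) ** 2 for i in test))
        errors.sort()
        # Nearest-rank empirical quantile of the held-out errors.
        k = min(len(errors) - 1, max(0, math.ceil(confidence * len(errors)) - 1))
        results.append((n, statistics.fmean(errors), errors[k]))
    return results


def fit_line(X_train, y_train):
    """Stand-in learner: ordinary least squares on one feature (not an SVM)."""
    mx, my = statistics.fmean(X_train), statistics.fmean(y_train)
    sxx = sum((x - mx) ** 2 for x in X_train)
    sxy = sum((x - mx) * (t - my) for x, t in zip(X_train, y_train))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


# Synthetic regression data (illustrative only, not the paper's data sets).
data_rng = random.Random(1)
X = [data_rng.uniform(0, 10) for _ in range(200)]
y = [2.0 * x + 1.0 + data_rng.gauss(0, 1) for x in X]

curve = error_vs_sample_size(fit_line, X, y, sizes=[5, 20, 80])
for n, mean_err, bound in curve:
    print(f"n={n:3d}  mean MSE={mean_err:.3f}  90% quantile={bound:.3f}")
```

To apply the same loop to an SVM, `fit` could wrap any support vector regressor with fixed hyper-parameters, for instance scikit-learn's `SVR`, provided it returns a per-sample prediction function. Reading the quantile column against `n` gives a rough sense of how many samples are needed to reach a target error at the chosen confidence level.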


