On Validation Setup for Multiclass Imbalanced Data Sets

2016 5th Brazilian Conference on Intelligent Systems (BRACIS). Pub Date: 2016-10-01. DOI: 10.1109/BRACIS.2016.090

Evandro J. R. Silva, C. Zanchettin


Abstract

Experiments are commonly validated with cross-validation methods. In the literature, 10-fold cross-validation, followed by bootstrap, is the most recommended method. However, a study of a proper validation procedure for imbalanced data sets is lacking, especially for the rare-class case. In this work, the most widely used validation methods were tested on ten imbalanced data sets, with a generic classifier and an ad hoc classifier. The analyses showed that 10-fold cross-validation, followed by hold-out, is the recommended method when using a generic classifier. For the ad hoc classifier, 10-fold cross-validation, followed by bootstrap, is recommended. When a data set contains rare classes, the most recommended method is repeated hold-out.
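As a rough illustration of why rare classes can favor repeated hold-out, the sketch below builds repeated hold-out splits on toy multiclass labels and counts how many test folds of a 10-fold split would necessarily lack the rarest class. It is not the authors' protocol: the abstract does not say how the splits were drawn, so the stratification, the 30% test fraction, the toy labels, and the helper names (`stratified_holdout`, `repeated_holdout`, `folds_missing_class`) are all assumptions made for this example.

```python
import random
from collections import Counter, defaultdict


def stratified_holdout(labels, test_fraction=0.3, seed=0):
    """Split indices into train/test so that every class with at least two
    examples appears on both sides (stratification is an assumption here)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        if len(indices) < 2:
            # A singleton class cannot be on both sides; keep it for training.
            train.extend(indices)
            continue
        n_test = round(len(indices) * test_fraction)
        # Clamp so that at least one example lands on each side.
        n_test = min(max(n_test, 1), len(indices) - 1)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def repeated_holdout(labels, repeats=10, test_fraction=0.3, seed=0):
    """Yield `repeats` stratified hold-out splits, one per seed."""
    for r in range(repeats):
        yield stratified_holdout(labels, test_fraction, seed + r)


def folds_missing_class(labels, k=10):
    """Minimum number of test folds in a k-fold split that contain no
    example of the rarest class (a class with m < k examples can appear
    in at most m folds)."""
    rarest_count = min(Counter(labels).values())
    return max(k - rarest_count, 0)


# Hypothetical toy labels: class "c" is rare (4 of 104 examples).
labels = ["a"] * 70 + ["b"] * 30 + ["c"] * 4
splits = list(repeated_holdout(labels, repeats=5))
```

With these toy labels, every hold-out test set still contains the rare class, whereas any 10-fold split leaves at least `folds_missing_class(labels, 10)` test folds without a single example of it, so per-class metrics on those folds are undefined.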

