SPY: A Novel Resampling Method for Improving Classification Performance in Imbalanced Data

2015 Seventh International Conference on Knowledge and Systems Engineering (KSE) Pub Date : 2015-10-01 DOI:10.1109/KSE.2015.24

Xuan Tho Dang, D. Tran, Osamu Hirose, K. Satou

Citations: 9

Abstract

In recent years, class-imbalanced datasets have complicated the analysis and understanding of raw data that support decision-making in many domains, especially biomedical data classification. Although a few class imbalance learning approaches have achieved promising results, existing methods have not yet solved the problem completely. SMOTE is a well-known, general over-sampling method that addresses this problem; in some cases, however, it fails to improve classification performance or even reduces it. We therefore developed a novel method named SPY. Experimental results on five imbalanced benchmark datasets from the UCI Machine Learning Repository showed that our method achieved better sensitivity and G-mean values than the control method (i.e., no over-sampling), SMOTE, and several SMOTE variants, including safe-level-SMOTE, safe-SMOTE, and borderline-SMOTE.
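The abstract reports sensitivity and G-mean but does not define them. The sketch below assumes the usual definitions in the imbalanced-learning literature: sensitivity is the true positive rate on the minority (positive) class, and G-mean is the geometric mean of sensitivity and specificity. The function name, the choice of label `1` for the minority class, and the toy labels are illustrative assumptions, not taken from the paper.

```python
from math import sqrt


def sensitivity_and_gmean(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, G-mean) for binary labels.

    Assumes the minority class is the `positive` label (hypothetical choice).
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity, sqrt(sensitivity * specificity)


# Toy imbalanced example: 2 minority (positive) vs 8 majority samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
sens, spec, gmean = sensitivity_and_gmean(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case the accuracy is 0.8 even though only half of the minority samples are detected, which is why sensitivity and G-mean are more informative than accuracy on imbalanced data.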

