Solving Cross-Selling Problems with Ensemble Learning: A Case Study

2008 International Conference on Advanced Computer Theory and Engineering
Pub Date: 2008-12-20
DOI: 10.1109/ICACTE.2008.86

X. Guo, Yilong Yin, Guang-Tong Zhou, Cailing Dong


Abstract

This paper presents our solution to the PAKDD Competition 2007 as a case study of cross-selling problems. After a brief description of the data mining task, we discuss several difficulties the task poses from a data mining perspective. We then describe the data pre-processing. In the proposed solution, we combine under-sampling and over-sampling techniques to reduce the class imbalance of the modeling dataset externally, and we adjust the parameters of each base learner internally to address cost-sensitivity. We then build an ensemble of the base learners to achieve better predictive performance. Experimental results on the real-world prediction dataset provided by PAKDD Competition 2007 show that our solution is effective and efficient, with an AUC of 60.73%.
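As a rough illustration of the pipeline the abstract outlines, the sketch below combines random under-sampling of the majority class with random over-sampling of the minority class, averages the scores of several base learners, and computes AUC. It is not the authors' implementation: the abstract does not state the sampling ratios, the base learners, or the combination rule, so the function names, the `majority_keep` and `minority_factor` parameters, and the use of simple score averaging are all assumptions made for this example. The cost-sensitive parameter tuning of the base learners is not shown.

```python
import random


def rebalance(rows, labels, majority_keep=0.5, minority_factor=2, seed=0):
    """Under-sample class 0 and over-sample class 1 (hypothetical ratios).

    majority_keep: fraction of majority (label 0) rows to keep.
    minority_factor: minority (label 1) rows end up this many times as numerous.
    """
    rng = random.Random(seed)
    neg = [r for r, y in zip(rows, labels) if y == 0]
    pos = [r for r, y in zip(rows, labels) if y == 1]
    # Under-sampling: keep a random subset of the majority class.
    kept_neg = rng.sample(neg, int(len(neg) * majority_keep))
    # Over-sampling: draw extra minority rows with replacement.
    extra_pos = [rng.choice(pos) for _ in range(len(pos) * (minority_factor - 1))]
    new_rows = kept_neg + pos + extra_pos
    new_labels = [0] * len(kept_neg) + [1] * (len(pos) + len(extra_pos))
    return new_rows, new_labels


def average_scores(score_lists):
    """Combine base learners by averaging their per-example scores (assumed rule)."""
    return [sum(s) / len(s) for s in zip(*score_lists)]


def auc(labels, scores):
    """AUC: chance that a random positive is scored above a random negative.

    Ties count as half a win. Quadratic in the data size, fine for a sketch.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In use, each base learner would be trained on the output of `rebalance`, its scores on the held-out prediction set would be passed to `average_scores`, and the combined scores would be evaluated with `auc`.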



