Solving Cross-Selling Problems with Ensemble Learning: A Case Study

X. Guo, Yilong Yin, Guang-Tong Zhou, Cailing Dong
2008 International Conference on Advanced Computer Theory and Engineering. Published 2008-12-20. DOI: 10.1109/ICACTE.2008.86 (https://doi.org/10.1109/ICACTE.2008.86)

Abstract

This paper presents our solution to the PAKDD Competition 2007 as a case study of cross-selling problems. After briefly describing the data mining task, we discuss several difficulties it poses from a data mining perspective. We then describe the data pre-processing. In the proposed solution, we combine under-sampling and over-sampling techniques to reduce the class imbalance of the modeling dataset externally. In addition, we adjust the parameters of each base learner internally to handle cost-sensitivity. We then build an ensemble of the base learners to achieve better predictive performance. Experimental results on the real-world prediction dataset provided by PAKDD Competition 2007 show that our solution is effective and efficient, with an AUC value of 60.73%.
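
The abstract names three ingredients: combined under- and over-sampling, an ensemble of base learners, and evaluation by AUC. The following is a minimal sketch of those ideas in plain Python, not the authors' implementation. The function names (`rebalance`, `average_scores`, `auc`), the per-class target size, and the use of score averaging as the ensemble rule are all assumptions made for illustration; the abstract does not specify the sampling ratios, the base learners, or how their outputs are combined. The cost-sensitive parameter adjustment is not shown.

```python
import random


def rebalance(rows, labels, target_per_class, seed=0):
    """Bring every class to `target_per_class` examples (hypothetical helper).

    Classes larger than the target are under-sampled; smaller classes are
    over-sampled by duplicating randomly chosen examples.
    """
    rng = random.Random(seed)
    out_rows, out_labels = [], []
    for cls in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == cls]
        if len(members) >= target_per_class:
            # Under-sampling: draw without replacement.
            chosen = rng.sample(members, target_per_class)
        else:
            # Over-sampling: keep every example, then add random duplicates.
            shortfall = target_per_class - len(members)
            chosen = members + [rng.choice(members) for _ in range(shortfall)]
        out_rows.extend(chosen)
        out_labels.extend([cls] * target_per_class)
    return out_rows, out_labels


def average_scores(score_lists):
    """Combine base learners by averaging their scores per example.

    Averaging is an assumed combination rule, chosen only for simplicity.
    """
    return [sum(col) / len(col) for col in zip(*score_lists)]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Labels are 1 (positive) or 0.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In use, one would call `rebalance` on the modeling data, train each base learner on the rebalanced set, score the prediction data with each learner, pass the per-learner score lists to `average_scores`, and report `auc` on the combined scores. The pairwise AUC shown here is quadratic in the number of examples, so a rank-based computation would be preferable on a dataset of competition size.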