Investigating the impact of undersampling and bagging: an empirical investigation for customer attrition modeling

IF 4.4 | Zone 3 (Management) | Q1, Operations Research & Management Science
Arno De Caigny, Kristof Coussement, Matthijs Meire, Steven Hoornaert
Journal: Annals of Operations Research, volume 346, issue 3, pages 2401-2421
Published: 2025-02-11
DOI: 10.1007/s10479-025-06516-9
URL: https://link.springer.com/article/10.1007/s10479-025-06516-9
Publication type: Journal Article
Citations: 0

Abstract

Given the growing interest in using AI and analytics to support CRM decision making, we discuss why undersampling and bagging are popular techniques in customer churn prediction (CCP). Undersampling helps tackle the class imbalance problem, and bagging improves model stability. However, the extant CCP literature is unclear on how undersampling affects model stability and predictive performance, while bagging has difficulty handling class imbalance. We therefore extend existing CCP research by benchmarking underbagging, which combines undersampling and bagging. Combining the two techniques recovers customer data that undersampling alone would discard, by using those records across multiple bags, while still passing an undersampled, more balanced training set to each classifier. In an extensive experiment on 11 real-life CCP datasets, underbagging is benchmarked against its constituents and other popular CCP classifiers in terms of predictive performance, profit, and operational efficiency. Our results indicate that underbagging is a valid and reliable alternative framework for CCP.
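
To make the combination of undersampling and bagging concrete, here is a minimal Python sketch of one common underbagging variant. It is not the authors' implementation: the abstract does not specify how bags are drawn or which base classifier is used. The sketch assumes that label 1 marks churners (the minority class), that every bag keeps all churners and draws an equal number of non-churners without replacement, and that a toy nearest-centroid rule stands in for the base classifier. All function names are hypothetical.

```python
import random
from statistics import mean


def make_underbags(labels, n_bags=10, seed=0):
    """Return n_bags lists of row indices, each balanced between classes.

    Assumption: label 1 is the minority class (churners) and is no
    larger than the majority class; otherwise rng.sample raises ValueError.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    bags = []
    for _ in range(n_bags):
        # Undersample: draw as many non-churners as there are churners.
        sampled_majority = rng.sample(majority, len(minority))
        bags.append(minority + sampled_majority)
    return bags


def fit_centroids(X, y):
    """Toy base learner (illustrative only): per-class feature means."""
    def centroid(cls):
        rows = [x for x, label in zip(X, y) if label == cls]
        return [mean(col) for col in zip(*rows)]
    return centroid(0), centroid(1)


def predict_one(model, x):
    """Predict the class whose centroid is closer (squared Euclidean)."""
    c0, c1 = model
    d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
    d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
    return 1 if d1 < d0 else 0


def underbagging_fit(X, y, n_bags=10, seed=0):
    """Train one base learner per balanced bag."""
    models = []
    for bag in make_underbags(y, n_bags, seed):
        models.append(fit_centroids([X[i] for i in bag], [y[i] for i in bag]))
    return models


def underbagging_score(models, x):
    """Churn score: fraction of bagged models that vote 'churn'."""
    return mean(predict_one(m, x) for m in models)


# Tiny made-up dataset: 8 non-churners, 2 churners, one feature.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [2.0], [2.2]]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
models = underbagging_fit(X, y, n_bags=5, seed=42)
print(underbagging_score(models, [2.1]), underbagging_score(models, [0.3]))
```

Because each bag draws a different random subset of non-churners, majority-class records that a single undersampled training set would discard can still contribute through other bags, while every base learner sees a balanced sample.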

Source journal: Annals of Operations Research (Management Science: Operations Research & Management Science)
CiteScore: 7.90
Self-citation rate: 16.70%
Articles published: 596
Review time: 8.4 months
About the journal: The Annals of Operations Research publishes peer-reviewed original articles dealing with key aspects of operations research, including theory, practice, and computation. The journal publishes full-length research articles, short notes, expositions and surveys, reports on computational studies, and case studies that present new and innovative practical applications. In addition to regular issues, the journal publishes periodic special volumes that focus on defined fields of operations research, ranging from the highly theoretical to the algorithmic and the applied. These volumes have one or more Guest Editors who are responsible for collecting the papers and overseeing the refereeing process.