Predicting customer churn using machine learning: A case study in the software industry

Impact factor: 4.0 | JCR: Q2 | Category: Business
João Rolim Dias, Nuno Antonio
Journal: Journal of Marketing Analytics
Published: 2023-12-02
DOI: 10.1057/s41270-023-00269-9 (https://doi.org/10.1057/s41270-023-00269-9)
Citations: 0

Abstract

Customer churn is the phenomenon of customers discontinuing their relationship with a company. The problem cuts across many industries, including the software industry. This study uses Machine Learning to build a predictive model that identifies potential churners at a Portuguese software house. Six popular Machine Learning models (Random Forest, AdaBoost, Gradient Boosting Machine, Multilayer Perceptron Classifier, XGBoost, and Logistic Regression) were developed to assess which performs best. The experimental results show that boosting techniques such as XGBoost deliver the best predictive performance: the XGBoost model achieves a Recall of 0.85 and a ROC AUC of 0.86. Beyond model performance, analysis of the model's feature importance revealed that factors such as the time to solve a support ticket, the type of application, the license age, and the number of incidents significantly influence customer churn. These insights can help the software industry identify key drivers of churn and prioritize retention efforts accordingly.
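The two metrics reported in the abstract, Recall and ROC AUC, can be illustrated with a minimal pure-Python sketch. This is not the authors' code or data: the labels, scores, and the 0.5 decision threshold below are hypothetical toy values chosen only to show how each metric is computed.

```python
from typing import Sequence


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of actual churners (label 1) that were predicted as churners."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return true_pos / (true_pos + false_neg)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random churner is scored above a random non-churner.

    Uses the pairwise (Mann-Whitney) formulation; ties count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = churned, 0 = retained.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical churn scores, as a classifier might output.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]

# Assumed decision threshold; the abstract does not state one.
threshold = 0.5
y_pred = [1 if s >= threshold else 0 for s in scores]

print(f"Recall:  {recall(y_true, y_pred):.2f}")   # Recall:  0.75
print(f"ROC AUC: {roc_auc(y_true, scores):.3f}")  # ROC AUC: 0.875
```

Recall depends on the chosen threshold, whereas ROC AUC summarizes ranking quality across all thresholds, which is one reason churn studies commonly report both.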

Source journal
CiteScore: 5.40
Self-citation rate: 16.70%
Articles published: 46
About the journal: Data has become the new ore in today’s knowledge economy. However, merely storing and reporting are not enough to thrive in today’s increasingly competitive markets. What is called for is the ability to make sense of all these oceans of data, and to apply those insights to the way companies approach their markets, adjust to changing market conditions, and respond to new competitors. Marketing analytics lies at the heart of this contemporary wave of data-driven decision-making. Companies can no longer survive when they rely on gut instinct to make decisions. Strategic leverage of data is one of the few remaining sources of sustainable competitive advantage. New products can be copied faster than ever before. Staff are becoming less loyal as well as more mobile, and business centers themselves are moving across the globe in a world that is getting flatter and flatter. The Journal of Marketing Analytics brings together applied research and practice papers in this blossoming field. A unique blend of applied academic research, combined with insights from commercial best practices, makes the Journal of Marketing Analytics a perfect companion for academics and practitioners alike. Academics can stay in touch with the latest developments in this field. Marketing analytics professionals can read about the latest trends and cutting-edge academic research in this discipline. The Journal of Marketing Analytics will feature applied research papers on topics like targeting, segmentation, big data, customer loyalty and lifecycle management, cross-selling, CRM, data quality management, multi-channel marketing, and marketing strategy. The Journal of Marketing Analytics aims to combine the rigor of carefully controlled scientific research methods with the applicability of real-world case studies. Our double-blind review process ensures that papers are selected on their content and merits alone, selecting the best possible papers in this field.