Enhancing customer retention with machine learning: A comparative analysis of ensemble models for accurate churn prediction
Payam Boozary, Sogand Sheykhan, Hamed GhorbanTanhaei, Cosimo Magazzino
International Journal of Information Management Data Insights, vol. 5, no. 1, Article 100331. Published 2025-02-27.
DOI: 10.1016/j.jjimei.2025.100331
https://www.sciencedirect.com/science/article/pii/S2667096825000138
Abstract
This paper investigates machine learning models for customer churn prediction, comparing ensemble approaches such as XGBoost and Random Forest with classical classifiers. The study evaluates the strengths and weaknesses of each approach on complex datasets through a detailed analysis of confusion matrices and Receiver Operating Characteristic (ROC) curves. The ensemble models outperformed the classical classifiers on the key criteria of accuracy, precision, recall, and F1 score. These results indicate that ensemble approaches produce accurate and reliable predictions, making them well suited to customer retention efforts. The paper offers practical insights for firms that want to use advanced machine learning methods to make better strategic decisions and retain more customers.
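The evaluation criteria named in the abstract all follow from the four cells of a binary confusion matrix. The sketch below is illustrative only and is not the authors' code: the labels are made up (1 = churned, 0 = retained), and the names `ConfusionMatrix`, `confusion_matrix`, and `scores` are hypothetical helpers introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # churners correctly flagged as churners
    fp: int  # retained customers wrongly flagged as churners
    fn: int  # churners the model missed
    tn: int  # retained customers correctly left unflagged


def confusion_matrix(y_true, y_pred):
    """Count the four outcomes for binary labels (1 = churned, 0 = retained)."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return ConfusionMatrix(tp, fp, fn, tn)


def scores(cm):
    """Derive accuracy, precision, recall, and F1 from the counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    total = cm.tp + cm.fp + cm.fn + cm.tn
    accuracy = (cm.tp + cm.tn) / total if total else 0.0
    precision = cm.tp / (cm.tp + cm.fp) if (cm.tp + cm.fp) else 0.0
    recall = cm.tp / (cm.tp + cm.fn) if (cm.tp + cm.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration; not data from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
cm = confusion_matrix(y_true, y_pred)
result = scores(cm)
print(cm)
print(result)
```

In churn settings the classes are often imbalanced, so precision, recall, and F1 are usually read alongside accuracy rather than replaced by it.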