Bank Customer Churn Prediction Using SMOTE: A Comparative Analysis

Qeios. Publication date: 2024-03-18. DOI: 10.32388/h82xtw

M. A. Hambali, Ishaku Andrew

Abstract

In today's market, customers have a plethora of options when deciding where to invest their money. Consequently, customer churn and engagement have emerged as prominent concerns. With an increasing number of service providers targeting the same customer base, providers must understand evolving customer behavior and heightened expectations in order to retain their clientele. Numerous studies have addressed customer churn, and data mining is frequently employed to predict bank customer attrition. Although researchers have proposed various approaches for predicting churn, some machine learning (ML) algorithms struggle to identify churners accurately, especially when the dataset is imbalanced. This paper therefore presents an application of the Synthetic Minority Over-sampling Technique (SMOTE) to a bank churn dataset. SMOTE was employed to address the data imbalance, and a Genetic Algorithm (GA) was applied to select the most informative features from the original dataset. The selected features were evaluated using four classification algorithms: Random Forest (RF), K-Nearest Neighbor (KNN), Artificial Neural Network (ANN), and AdaBoost. The KNN model outperformed the other models in terms of accuracy (96%), precision (96%), and F-measure (96%). Furthermore, we compared our results with existing models that used the same dataset, and our proposed strategy outperformed them.
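
The abstract names SMOTE as the balancing step but gives no implementation details. The sketch below illustrates only SMOTE's core idea, which is to create a synthetic minority sample by interpolating between an existing minority sample and one of its nearest minority-class neighbours. The function name `smote`, its parameters, and the toy data are assumptions made for illustration. This is not the authors' code, and it is not the API of any particular library.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric feature lists (minority class only).
    k: number of nearest minority neighbours considered for each base sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 3 churners (minority) against 8 retained customers.
churners = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2]]
n_retained = 8
new_points = smote(churners, n_new=n_retained - len(churners), k=2)
balanced_churners = churners + new_points  # now as many churners as retained
```

Because every synthetic point lies on a segment between two real minority samples, the minority class grows without exact duplicates. In practice, oversampling is usually applied to the training split only, so that the evaluation data contain real customers exclusively. The abstract does not say how the authors split their data.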
