A method for predicting bank customer churn based on an ensemble machine learning model

A. Puchkov, Maxim I. Dli, M. Vasiľková, Nikolay N. Prokimnov
Journal Of Applied Informatics. Published 2024-02-12. DOI: https://doi.org/10.37791/2687-0649-2024-19-1-5-27

Abstract

This paper presents a method for predicting customer churn at a commercial bank, together with software tools that implement it. The method processes client data with machine learning models, including deep artificial neural networks. The object of the study is a commercial bank, and the subject is its activity in the B2C segment, that is, commercial interaction between a business and individuals. The topic is relevant because banks are increasingly introducing digital services to reduce non-operating costs, in particular the cost of retaining clients, since attracting a new client costs much more than keeping an existing one. The scientific novelty lies in the proposed churn prediction method and in the algorithm underlying the software that implements it. The proposed ensemble forecasting model is based on three classification algorithms: k-means, random forest, and a multilayer perceptron. The outputs of the individual models are aggregated by a trainable tree of Mamdani-type fuzzy inference systems. The ensemble is trained in two stages: first the three classifiers are trained, and then the tree of fuzzy inference systems is trained on their outputs. The ensemble model produces a static forecast, whose results feed a dynamic forecast implemented in two variants: one based on the recursive least squares method and one based on a convolutional neural network. Experiments on a synthetic dataset taken from Kaggle showed that the ensemble model achieves higher binary classification quality than any of the individual models.
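The aggregation step can be illustrated with a small sketch of a two-level tree of Mamdani-type fuzzy inference systems that merges three churn scores into one. Everything specific here is an assumption made for illustration and is not taken from the paper: the triangular membership functions, the four hand-written rules, the tree shape (two models merged first, the third merged with that result), and the names `tri`, `mamdani`, and `ensemble_score`. In the paper the tree is trained on the classifier outputs, whereas this sketch uses fixed rules.

```python
# Illustrative sketch only: membership functions, rules, and tree shape are assumptions.

def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b (a <= b <= c).
    When a == b or b == c the set has a flat shoulder on that side."""
    if x <= a:
        return 1.0 if a == b else 0.0
    if x >= c:
        return 1.0 if b == c else 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Output fuzzy sets over the churn-score universe [0, 1] (assumed shapes).
OUT_SETS = {
    "low": (0.0, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.0),
}
GRID = [i / 100 for i in range(101)]  # discretised universe for defuzzification

def mamdani(a, b):
    """Combine two scores in [0, 1] with a four-rule Mamdani system."""
    lo_a, hi_a = tri(a, 0.0, 0.0, 1.0), tri(a, 0.0, 1.0, 1.0)
    lo_b, hi_b = tri(b, 0.0, 0.0, 1.0), tri(b, 0.0, 1.0, 1.0)
    # Rule firing strengths: AND is min, rules sharing a consequent are joined by max.
    strength = {
        "low": min(lo_a, lo_b),
        "medium": max(min(lo_a, hi_b), min(hi_a, lo_b)),
        "high": min(hi_a, hi_b),
    }
    num = den = 0.0
    for y in GRID:
        # Min implication clips each output set; max aggregates the clipped sets.
        mu = max(min(strength[name], tri(y, *OUT_SETS[name])) for name in OUT_SETS)
        num += y * mu
        den += mu
    return num / den if den else 0.5  # centroid defuzzification

def ensemble_score(p_kmeans, p_forest, p_mlp):
    """Two-level tree: merge two model scores, then merge the result with the third."""
    first = mamdani(p_kmeans, p_forest)
    return mamdani(first, p_mlp)

if __name__ == "__main__":
    print(ensemble_score(0.2, 0.8, 0.6))
```

Calling `ensemble_score` with the three per-model churn scores for a client returns a single score in [0, 1] that can then be thresholded for the binary churn decision. A trainable version would tune the membership parameters or rule set on held-out classifier outputs, which corresponds to the second training stage described in the abstract.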