Research on Bank Marketing Behavior Based on Machine Learning

Deli Wang
DOI: 10.1145/3421766.3421800 (https://doi.org/10.1145/3421766.3421800)
Published in: Proceedings of the 2nd International Conference on Artificial Intelligence and Advanced Manufacture
Publication date: 2020-10-15
Citations: 5

Abstract

Data mining technology is maturing and is now widely used across many fields. With the advent of the customer-oriented era and increased competition among banks, it is being widely applied in banking and finance to identify target customer groups and promote bank sales. Using the Bank Marketing data set from the UCI Machine Learning Repository, this article applies the C5.0 algorithm on the Clementine platform to classify customers and, based on the classification results, proposes suggestions for bank marketing. The article first explores the Bank Marketing data set and describes the distribution of customer backgrounds in it. It then examines data quality and corrects outliers and extreme values by replacing each with the closest normal value. Next, the optimal feature variables are selected: a Filter node removes variables that are unimportant for classification, and among highly correlated variables only one is kept to reduce redundancy. The final variables are: previous, age, duration, outcome, contact, housing, job, loan, marital, education. Sampling nodes are then used to undersample and balance the data set. On this basis, a C5.0 classification model is built and its parameters are tuned, yielding eight classification rules, from which suggestions for determining the target group are derived. Finally, the article introduces four other classification algorithms (C&T, QUEST, CHAID, and Neural Networks) and compares C5.0 with them on the balanced data set. The comparison shows that the algorithms differ to some extent, while overall prediction accuracy is good.
This article combines data mining theory with practical banking problems and establishes a bank target-customer classification model based on the C5.0 algorithm. The resulting classification rules can help banks segment their customers and take targeted measures to improve marketing efficiency.
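
The abstract does not say how outliers were identified or how the undersampling was configured in Clementine. As a rough sketch only, the two preprocessing steps it describes (replacing outliers with the closest normal value, then undersampling to balance the classes) might look like the following in Python. The IQR fence rule with k = 1.5, the random undersampling strategy, and the function names clip_to_fences and undersample are all assumptions made for illustration; none of them comes from the paper.

```python
import random
import statistics
from collections import defaultdict


def clip_to_fences(values, k=1.5):
    """Replace values outside the IQR fences with the nearest in-range observed value.

    Assumption: outliers are defined by the Q1 - k*IQR and Q3 + k*IQR fences.
    The paper only states that outliers were replaced by the closest normal data.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    # Observed values that fall inside the fences are treated as "normal".
    normal = [v for v in values if low <= v <= high]
    lo_val, hi_val = min(normal), max(normal)
    # Each outlier is pulled to the closest normal value on its side.
    return [min(max(v, lo_val), hi_val) for v in values]


def undersample(rows, label_of, seed=0):
    """Randomly drop rows from larger classes until every class matches the smallest one.

    Assumption: plain random undersampling; the sampling-node settings
    used in the paper are not given in the abstract.
    """
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    target = min(len(group) for group in by_class.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced
```

For example, clip_to_fences could be applied to a numeric column such as age or duration, and undersample(rows, lambda r: r["y"]) to a list of record dictionaries whose target field is assumed here to be named "y". The balanced rows would then be passed to whichever tree learner is available, since C5.0 itself is not part of the Python standard library.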