Identifying Optimal Parameters And Their Impact For Predicting Credit Card Defaulters Using Machine-Learning Algorithms

Muhammad Qasim Idrees, Humaira Naeem, Muhammad Imran, Asma Batool, Nadia Tabassum
Lahore Garrison University Research Journal of Computer Science and Information Technology. Journal Article. Published 2022-03-30. DOI: 10.54692/lgurjcsit.2022.0601260 (https://doi.org/10.54692/lgurjcsit.2022.0601260)
Citations: 1

Abstract

Data mining and machine learning are emerging technologies that are spreading rapidly into many fields, including the financial sector. Many studies have applied machine learning techniques to banking data, but they have shortcomings: most focus mainly on achieving high accuracy, and some only compare the performance of different classifiers. Another major drawback is that they do not identify optimal parameters or the impact of those parameters. In this research, we identify optimal parameters, which are valuable for the credit scoring process and might also be used to predict credit card defaulters, and we measure their impact on the results. We use feature selection and classification techniques with three classifiers, KStar, SMO and multilayer perceptron, and repeat the feature selection and classification process for each classifier. First, we apply feature selection techniques to the dataset with each classifier to find candidate optimal parameters. In the next phase, we use classification to measure the impact of these candidates and confirm our findings. In each round of classification, we include some of the parameters available in the dataset and exclude others, and record the results of each run with each classifier. In this way, we identify the optimal parameters and their impact on the results, while also analyzing the performance of the classifiers. The study uses the “credit card defaults” dataset obtained from the UCI Machine Learning Repository. We use two feature selection techniques, a ranker approach and an evolutionary search method, and then apply classification techniques to the dataset. This research can help reduce the complexity of the credit scoring process. Through this study, we identify up to six optimal parameters and determine their impact on classifier performance. We also find that the multilayer perceptron was the best-performing of the three classifiers. In the future, this work can be extended to other fields, where the same mechanism for finding optimal parameters and their impact can help predict results.
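
The include-and-exclude rounds described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes synthetic data in place of the UCI dataset and a simple nearest-centroid classifier as a stand-in for KStar, SMO and the multilayer perceptron. The function names `make_data`, `centroid_accuracy` and `evaluate_subsets` are hypothetical. The exhaustive subset loop shows only the idea of rerunning classification with different parameter subsets and recording each result, and it does not implement the ranker or evolutionary search methods.

```python
import random
from itertools import combinations


def make_data(n=400, seed=0):
    """Synthetic stand-in data (assumption): columns 0 and 1 carry signal,
    columns 2 and 3 are pure noise."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        rows.append([
            rng.gauss(3.0 * y, 1.0),  # informative column
            rng.gauss(2.5 * y, 1.0),  # informative column
            rng.gauss(0.0, 1.0),      # noise column
            rng.gauss(0.0, 1.0),      # noise column
        ])
        labels.append(y)
    return rows, labels


def centroid_accuracy(train_X, train_y, test_X, test_y, cols):
    """Accuracy of a nearest-centroid classifier that only sees `cols`."""
    centroids = {}
    for label in (0, 1):
        members = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(m[c] for m in members) / len(members)
                            for c in cols]
    correct = 0
    for x, y in zip(test_X, test_y):
        # Squared Euclidean distance to each class centroid.
        dists = {label: sum((x[c] - mu) ** 2 for c, mu in zip(cols, cen))
                 for label, cen in centroids.items()}
        correct += min(dists, key=dists.get) == y
    return correct / len(test_y)


def evaluate_subsets(rows, labels, max_size=2):
    """Run one classification round per column subset and record accuracy."""
    split = len(rows) // 2  # first half for training, second half for testing
    train_X, test_X = rows[:split], rows[split:]
    train_y, test_y = labels[:split], labels[split:]
    results = {}
    for size in range(1, max_size + 1):
        for cols in combinations(range(len(rows[0])), size):
            results[cols] = centroid_accuracy(
                train_X, train_y, test_X, test_y, cols)
    return results


if __name__ == "__main__":
    rows, labels = make_data()
    results = evaluate_subsets(rows, labels)
    # List the subsets from most to least accurate.
    for cols, acc in sorted(results.items(), key=lambda kv: -kv[1]):
        print(cols, round(acc, 3))
```

Comparing the recorded accuracies across subsets indicates which columns matter: subsets made up only of noise columns should score near chance, while subsets containing the informative columns should score much higher.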