A Machine Learning Framework for Improving Classification Performance on Credit Approval

Pulung Hendro Prastyo, Septian Eko Prasetyo, S. Arti
Journal: IJID International Journal on Informatics for Development
Publication date: 2021-06-30
Publication type: Journal Article
DOI: 10.14421/ijid.2021.2384 (https://doi.org/10.14421/ijid.2021.2384)
Citations: 1

Abstract

A Machine Learning Framework for Improving Classification Performance on Credit Approval
Credit scoring is a model commonly used in the decision-making process to accept or refuse loan requests. The credit scoring model depends on the type of loan or credit and is complemented by various credit factors. At present, there is no accurate model for determining which applicants are eligible for loans. Therefore, an accurate and automatic model is needed to make it easier for banks to identify suitable applicants. To address this problem, we propose a new approach that combines a machine learning algorithm (Naïve Bayes), Information Gain (IG), and discretization to classify applicants. This research employed an experimental method using the Weka application. The Australian Credit Approval dataset, which contains 690 instances, was used. In this study, Information Gain serves as the feature selection method, selecting relevant features so that the Naïve Bayes algorithm can work optimally. A confusion matrix is used for evaluation and 10-fold cross-validation for validation. Based on the experimental results, the proposed method improves classification performance, reaching its highest average accuracy, precision, recall, and F-measure at 86.29%, 86.33%, 86.29%, and 86.30%, respectively. The proposed method also obtains an ROC area of 91.52%, which indicates that it can be categorized as an excellent classifier.
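
The feature selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' Weka configuration: the abstract does not say which discretization scheme was used, so equal-width binning is an assumption here, and the function names (`discretize_equal_width`, `information_gain`, `select_features`) and the toy data are hypothetical. The sketch discretizes numeric features, scores each by Information Gain against the class label, and keeps the features whose score exceeds a threshold. The Naïve Bayes classifier and the 10-fold cross-validation are not shown.

```python
import math
from collections import Counter

def discretize_equal_width(values, n_bins=2):
    """Map numeric values to bin indices 0..n_bins-1 (equal-width bins, an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: everything falls into one bin
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # min(..., n_bins - 1) keeps the maximum value inside the last bin
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(feature, labels):
    """IG = H(class) - H(class | feature) for a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def select_features(columns, labels, threshold=1e-9):
    """Rank discretized features by IG and keep those scoring above the threshold."""
    scores = {name: information_gain(discretize_equal_width(col), labels)
              for name, col in columns.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(name, score) for name, score in ranked if score > threshold]

# Toy data (hypothetical, not taken from the Australian Credit Approval dataset).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "income": [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0],  # separates the classes
    "noise": [1.0, 5.0, 1.0, 5.0, 1.0, 5.0, 1.0, 5.0],   # unrelated to the class
}
selected = select_features(columns, labels)
print([name for name, _ in selected])
```

In this toy case the informative feature is kept and the uninformative one is dropped. On real data the threshold, or the number of top-ranked features to keep, would have to be tuned.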