Multilayer Perceptron Optimization on Imbalanced Data Using SVM-SMOTE and One-Hot Encoding for Credit Card Default Prediction

Adi Sakti Almajid
Journal of Advances in Information Systems and Technology. Published 2022-09-06. DOI: 10.15294/jaist.v3i2.57061 (https://doi.org/10.15294/jaist.v3i2.57061)
Citations: 3

Abstract

Credit risk assessment, which classifies prospective users, is an important step in reducing the number of users who default. Classification on real-world datasets faces the problem of imbalanced data, which biases the trained model toward the majority class. As a result, the algorithm focuses on the majority class and ignores the minority class, even though both classes are equally important. To overcome this problem, one-hot encoding (OHE) and the SVM synthetic minority oversampling technique (SVM-SMOTE) are combined to optimize the multilayer perceptron (MLP) classification algorithm. OHE encodes the nominal categorical values, and SVM-SMOTE performs the oversampling. The performance of the optimized MLP model is then compared with the baseline using the AUC score. The data used is the default of credit card clients dataset from Taiwan, which has 30,000 instances. The highest AUC score achieved by the optimized MLP is 0.7184, an increase of 0.2179 over the baseline.
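The three building blocks named in the abstract (one-hot encoding, minority oversampling, and AUC scoring) can be illustrated with a minimal standard-library sketch. It is not the paper's implementation. The function names `one_hot`, `smote_like`, and `auc` are hypothetical, the category values are made up, and the oversampler is a simplification: it interpolates from any minority point, whereas SVM-SMOTE first fits an SVM and interpolates only around the minority support vectors.

```python
import math
import random


def one_hot(values):
    """Map nominal categories to 0/1 indicator vectors, one column per category."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows


def smote_like(minority, n_new, k=2, seed=0):
    """Create synthetic minority points by interpolating toward near neighbours.

    Simplification (assumption): every minority point may serve as a seed.
    SVM-SMOTE would restrict the seeds to minority support vectors of an SVM.
    Requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the seed, excluding the seed itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical nominal feature values, not taken from the dataset.
    print(one_hot(["single", "married", "single"]))
    print(smote_like([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], n_new=3))
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 0.5 corresponds to a ranking no better than chance. If the reported increase of 0.2179 is an absolute difference, the baseline AUC would be about 0.7184 - 0.2179 = 0.5005. The abstract does not say which tools were used; in practice, library implementations such as `SVMSMOTE` in imbalanced-learn and `MLPClassifier` in scikit-learn would be a likely starting point.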