Remigio Ismael Hurtado Ortiz, Edisson Salinas Jara, Juan Hurtado Ortiz, Johnny Maisincho Panjón
A data analytics method based on data science and machine learning for bank risk prediction in credit applications for financial institutions
2022 IEEE International Autumn Meeting on Power, Electronics and Computing (ROPEC), published 2022-11-09. DOI: 10.1109/ROPEC55836.2022.10018807 (https://doi.org/10.1109/ROPEC55836.2022.10018807)
Abstract
Banks grant credit so that customers can acquire goods or services, start or improve a business, and obtain other benefits. Problems can arise when customers become over-indebted or have little capacity to save, both of which increase the risk of default. Financial institutions therefore need tools for analyzing and predicting default risk. This research proposes a data analysis method, based on data science and machine learning, for predicting bank risk in credit applications. The analysis and the prediction of credit outcomes use the following predictive methods: Genetic Algorithms (GA), Random Forest (RF), K-Nearest Neighbors (KNN), Support Vector Machines (SVM), and Neural Networks (NN). The results are evaluated with quality metrics such as Accuracy, Precision, Recall, and F1 Score. The method uses a public dataset called Statlog [1]. This work opens the door to data analysis in other banking services, and its main objective is to help financial companies optimize their processes.
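The four quality metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the names `binary_metrics` and `BinaryMetrics`, the toy labels, and the choice of "default" as the positive class (label 1) are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = len(pairs) - tp - fp - fn                               # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


# Hypothetical labels (assumption): 1 = applicant defaulted, 0 = applicant repaid.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
m = binary_metrics(y_true, y_pred)
print(m)
```

In a credit setting, recall on the default class is often the figure to watch, since a missed default (false negative) usually costs the lender more than a rejected good applicant; the abstract itself does not state which metric the authors prioritize.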