Authors: Shaukat Ali Shahee, Rujavi Patel
Journal: Journal of Forecasting, vol. 44, no. 4, pp. 1513-1530
Publication date: 2025-01-07
Publication type: Journal Article
DOI: 10.1002/for.3252
URL: https://onlinelibrary.wiley.com/doi/10.1002/for.3252
Impact factor: 3.4 (JCR Q1, Economics)
An Explainable ADASYN-Based Focal Loss Approach for Credit Assessment
The integration of deep learning techniques with financial technology (fintech) has revolutionized credit risk analysis, a critical component of financial risk management. A pervasive challenge in credit risk assessment is the skewed class distribution of the data, which hinders accurate prediction, particularly for minority-class instances. Various solutions to class imbalance have been proposed in the literature, each with limitations. Focal loss is a well-known loss function for handling class imbalance by tuning the hyperparameter γ. However, an imbalance remains in the number of hard-to-learn observations between the classes. In this paper, we propose integrating ADASYN with focal loss to mitigate class imbalance and enhance credit scoring accuracy. ADASYN systematically generates synthetic data based on hard-to-learn examples to counter skewed distributions, while focal loss prioritizes the training of challenging examples, fostering more balanced model performance. The approach has been tested on real-world imbalanced datasets and credit assessment data, and the outcomes have been compared against a range of sampling technique and loss function combinations. The results show that the proposed strategy outperforms the other approaches. Although improving the accuracy of credit risk analysis is critical, model interpretability is just as important for enabling financial analysts to make sound decisions. To address this, we measure the global and local contributions of each feature using SHAP (Shapley additive explanations). According to global interpretability, the top features influencing credit risk assessment are checking account status, loan purpose, borrower age, credit history, and interest rate/installment rate. Moreover, local interpretability analysis reveals differences in both the magnitude and the direction of feature contributions. These findings not only broaden our knowledge of credit assessment services but also highlight the role they could play in attracting new clients and generating income. The paper also highlights how the proposed approach may be scaled to other imbalanced real-world datasets, improving model performance in terms of AUC, G-mean, and F-measure.
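As a rough illustration of two quantities the abstract names, the sketch below computes the binary focal loss in its commonly cited form, FL(p_t) = -α_t (1 - p_t)^γ log(p_t), and the G-mean, taken here as the square root of sensitivity times specificity. The abstract does not give the authors' exact formulation or settings, so the function names `focal_loss` and `g_mean`, the default γ = 2, and the optional class weight `alpha` are illustrative assumptions rather than the paper's configuration.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=None, eps=1e-12):
    """Binary focal loss for a single prediction (hypothetical helper).

    p     -- predicted probability of the positive class
    y     -- true label, 0 or 1
    gamma -- focusing parameter; gamma = 0 reduces to cross-entropy
    alpha -- optional weight for the positive class (1 - alpha for the negative)
    """
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0 - eps)
    if alpha is None:
        weight = 1.0
    else:
        weight = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    # Compare cross-entropy (gamma = 0) with focal loss (gamma = 2)
    # for an easy, a moderate, and a hard positive example.
    for p in (0.9, 0.6, 0.1):
        ce = focal_loss(p, 1, gamma=0.0)
        fl = focal_loss(p, 1, gamma=2.0)
        print(f"p={p:.1f}  cross-entropy={ce:.4f}  focal={fl:.4f}")
    print("G-mean:", g_mean([1, 1, 0, 0], [1, 0, 0, 0]))
```

Raising γ leaves the loss on hard examples (small p_t) nearly unchanged while shrinking the loss on easy ones, which is the sense in which focal loss prioritizes challenging examples. It does not change how many hard examples each class contributes, which is the gap the abstract assigns to ADASYN.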
About the journal:
The Journal of Forecasting is an international journal that publishes refereed papers on forecasting. It is multidisciplinary, welcoming papers dealing with any aspect of forecasting: theoretical, practical, computational and methodological. A broad interpretation of the topic is taken with approaches from various subject areas, such as statistics, economics, psychology, systems engineering and social sciences, all encouraged. Furthermore, the Journal welcomes a wide diversity of applications in such fields as business, government, technology and the environment. Of particular interest are papers dealing with modelling issues and the relationship of forecasting systems to decision-making processes.