From Black Box to Clarity: Machine Learning and Agnostic Techniques in Credit Risk Management

Impact Factor: 1.2 | JCR: Q3 (Business, Finance)

Journal of Corporate Accounting and Finance | Pub Date: 2026-04-05 | Epub Date: 2025-11-19 | DOI: 10.1002/jcaf.70020

Monia Antar, Rohail Hassan, Dora Barka

Volume 37, issue 2, pages 84-98 | Journal Article | Not open access | Full text: https://onlinelibrary.wiley.com/doi/10.1002/jcaf.70020

Citation count: 0

Abstract

Credit risk, a central concern for banks, was first addressed in the Basel framework through the capital adequacy ratio. As the banking landscape evolves, the need for better risk models remains pressing. This research examines the effectiveness of machine learning in credit risk assessment using a dataset of over 280 corporate loan applications, all for loans approved by a public Tunisian bank under the challenging economic conditions of 2023. The study compares logistic regression, neural networks, and random forests after preprocessing steps that include missing value treatment and variable selection through Lasso and Ridge regularization. Random forests achieve 94% accuracy in predicting default probabilities, outperforming neural networks (92%) and logistic regression (88%). Return on assets emerges as the most significant predictor for random forests, while the debt-to-equity ratio dominates the neural network and logistic regression predictions. To address the “black box” challenge of machine learning algorithms, we implement a novel three-tier interpretability framework combining SHAP, LIME, and Partial Dependence Plots (PDP). This approach enhances model transparency and reveals critical financial thresholds specific to emerging markets, which is particularly valuable given Tunisia's economic context. The results demonstrate that sophisticated analytics combined with robust interpretability methods can significantly improve credit risk assessment in challenging economic environments. The implications extend beyond traditional banks to microfinance institutions: the framework balances advanced predictive capability with transparent decision-making, which matters most in emerging markets, where default patterns differ significantly from those in developed economies.
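To make the partial dependence idea named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the scoring function `toy_default_probability`, its coefficients, and the sample rows are hypothetical and invented purely for illustration. The sketch only shows the general model-agnostic recipe: fix one feature at a grid value for every observation, query the model, and average the predictions.

```python
import math


def toy_default_probability(row):
    """Hypothetical stand-in for a fitted model (coefficients are invented).

    Takes (return_on_assets, debt_to_equity) and returns a probability.
    """
    roa, debt_to_equity = row
    score = -1.0 - 8.0 * roa + 0.9 * debt_to_equity
    return 1.0 / (1.0 + math.exp(-score))


def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction over all rows with one feature forced to each grid value.

    The function treats `predict` as a black box, so it works for any model.
    """
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # overwrite only the feature of interest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


# Synthetic (return_on_assets, debt_to_equity) pairs; not data from the study.
SAMPLE_ROWS = [(0.08, 0.5), (0.02, 1.8), (-0.03, 2.5), (0.05, 1.0), (0.00, 3.2)]
DEBT_GRID = [0.5, 1.0, 2.0, 3.0]

pd_curve = partial_dependence(toy_default_probability, SAMPLE_ROWS, 1, DEBT_GRID)
for value, avg in zip(DEBT_GRID, pd_curve):
    print(f"debt_to_equity={value:.1f} -> mean predicted default probability {avg:.3f}")
```

Because `partial_dependence` only calls `predict`, the same routine could in principle be pointed at a logistic regression, neural network, or random forest. A sharp bend in the resulting curve is the kind of pattern that would be read as a financial threshold.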


Source journal

Journal of Corporate Accounting and Finance (Business, Finance)

CiteScore

2.30

Self-citation rate

7.10%

Publication volume