Machine Learning for the Unlisted: Enhancing MSME Default Prediction with Public Market Signals

IF 5.9 | Zone 1, Economics | Q1 BUSINESS, FINANCE

Journal of Corporate Finance | Pub Date: 2025-06-11 | DOI: 10.1016/j.jcorpfin.2025.102830

Alessandro Bitetto, Stefano Filomeni, Michele Modina

Journal of Corporate Finance, Volume 94, Article 102830 (Journal Article). https://www.sciencedirect.com/science/article/pii/S0929119925000987

Citations: 0

Abstract

This paper contributes to the growing body of research on private firms, particularly private firm accounting. We explore the economic factors that drive improvements in the default prediction of unlisted private firms when peers’ market-based information is used. Specifically, we examine how the market-based default probability of a peer firm can provide valuable insights into the often noisy accounting data of private firms. To address our research question, we use a granular proprietary dataset of 10,136 Italian micro-, small-, and mid-sized enterprises (MSMEs) that are required to disclose their financial statements publicly. We propose a novel public–private firm mapping approach to investigate whether incorporating peers’ market-based information improves the accuracy of default predictions for private unlisted firms. Our mapping approach matches the market information of listed firms with private firms through a data-driven clustering technique based on a neural network autoencoder. This method enables us to link the Merton Probability of Default (PD) of public peers to the corresponding private firms within the same cluster. We then apply five statistical techniques – linear models, multivariate adaptive regression splines, support vector machines, k-nearest neighbours and random forests – to predict corporate default among private firms, comparing model performance with and without the inclusion of Merton’s PD estimated using peers’ market-based information. To assess the contribution of each predictor, we employ Shapley values. Our results demonstrate a significant improvement in default prediction for unlisted private firms when peers’ market-based information is incorporated, confirming that the noisy accounting data of private firms alone hinders accurate default prediction. Furthermore, our findings highlight the importance of banks broadening the scope of information used in credit risk assessments of private firms. These results have important policy implications for financial institutions and policymakers, providing a tool to mitigate the challenges posed by the noisy information disclosure of MSMEs while ensuring more accurate credit risk assessments.
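The peer-mapping step described in the abstract can be illustrated with a minimal sketch. It uses the textbook Merton formula, PD = N(-DD), where DD is the distance to default, and then passes a cluster-level PD from listed peers to private firms in the same cluster. Several things here are assumptions rather than the authors' procedure: asset value and asset volatility are taken as given (in practice they are usually backed out from equity prices), cluster labels are assumed to come from an earlier clustering step, the per-cluster aggregation is a simple mean, and all function names and numbers are hypothetical.

```python
import math
from collections import defaultdict


def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def merton_pd(asset_value: float, debt: float, drift: float,
              asset_vol: float, horizon: float = 1.0) -> float:
    """Textbook Merton default probability over `horizon` years.

    Assumes asset value and asset volatility are already known.
    """
    distance_to_default = (
        math.log(asset_value / debt)
        + (drift - 0.5 * asset_vol ** 2) * horizon
    ) / (asset_vol * math.sqrt(horizon))
    return normal_cdf(-distance_to_default)


def peer_pd_by_cluster(listed_firms):
    """Average the Merton PD of listed firms within each cluster.

    `listed_firms` is an iterable of (cluster_id, pd) pairs. Using the mean
    is an assumption; the abstract does not state the aggregation rule.
    """
    pds = defaultdict(list)
    for cluster_id, pd_value in listed_firms:
        pds[cluster_id].append(pd_value)
    return {cid: sum(vals) / len(vals) for cid, vals in pds.items()}


# Hypothetical listed firms: (cluster_id, Merton PD).
listed = [
    (0, merton_pd(120.0, 100.0, 0.05, 0.25)),
    (0, merton_pd(150.0, 100.0, 0.05, 0.20)),
    (1, merton_pd(105.0, 100.0, 0.03, 0.40)),
]
cluster_pd = peer_pd_by_cluster(listed)

# Hypothetical private firms with cluster labels from a prior clustering step.
private_firms = {"firm_a": 0, "firm_b": 1}

# Each private firm inherits its cluster's peer PD as an extra predictor.
peer_feature = {name: cluster_pd[cid] for name, cid in private_firms.items()}
```

The resulting `peer_feature` values would be appended to each private firm's accounting predictors, so that a classifier can be fitted once with and once without the peer PD and the two fits compared.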

Source journal

Journal of Corporate Finance (BUSINESS, FINANCE)

CiteScore: 11.80

Self-citation rate: 3.30%

About the journal: The Journal of Corporate Finance aims to publish high-quality, original manuscripts that analyze issues related to corporate finance. Contributions can be of a theoretical, empirical, or clinical nature. Topical areas of interest include, but are not limited to: financial structure, payout policies, corporate restructuring, financial contracts, corporate governance arrangements, the economics of organizations, the influence of legal structures, and international financial management. Papers that apply asset pricing and microstructure analysis to corporate finance issues are also welcome.