Effects of classification, feature selection, and resampling methods on bankruptcy prediction of small and medium-sized enterprises

Q1 Economics, Econometrics and Finance
Lenka Papíková, Mário Papík
Journal: Intelligent Systems in Accounting, Finance and Management, vol. 29, no. 4, pp. 254-281
Published: 2022-09-15
DOI: 10.1002/isaf.1521
URL: https://onlinelibrary.wiley.com/doi/10.1002/isaf.1521
Citations: 2

Abstract

Small and medium-sized enterprises are the pillars of an economy, and their poor performance has a negative impact on the living standards of the population and on a country's development. This study analyzes real-life data on 89,851 small and medium-sized enterprises, of which 295 have declared bankruptcy. The analysis is based on 27 financial ratios. The study framework combines seven classification methods, three resampling methods, and seven feature selection methods. Of all the classification methods applied, CatBoost achieved the best results for all combinations of resampling and feature selection methods. CatBoost surpassed the other classification methods on the area under the curve (AUC) metric, achieving a value of 99.95%. Applying resampling methods to the different classification models did not yield a statistically significant improvement for any of the resampling methods. The same was observed for feature selection methods. Based on these findings, we assume that individual resampling and feature selection methods do not improve model performance compared with the results on the original imbalanced sample. Our results suggest that, even though the data sample may be significantly imbalanced, with bankrupt companies in a small minority, most classification algorithms can handle this imbalance and achieve interesting results. Moreover, our findings have broad practical application for all stakeholders who may need to detect companies at risk of bankruptcy.
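
To make the class imbalance and the AUC metric mentioned in the abstract concrete, here is a minimal Python sketch. It is not the authors' code: the sample sizes (89,851 firms, 295 bankruptcies) come from the abstract, while the `auc` helper and the toy labels and scores are hypothetical and only illustrate how AUC can be computed as the share of (bankrupt, non-bankrupt) pairs that a model ranks correctly.

```python
# Figures taken from the abstract.
n_total = 89_851
n_bankrupt = 295

# Share of bankrupt firms in the sample (the minority class).
minority_share = n_bankrupt / n_total

# Accuracy of a trivial model that labels every firm "not bankrupt".
# It is very high, which is why accuracy alone says little on such data.
majority_baseline_accuracy = 1 - minority_share


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Returns the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = bankrupt, 0 = not bankrupt.
toy_labels = [0, 0, 0, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.8, 0.7, 0.9]

print(f"minority share: {minority_share:.3%}")
print(f"all-negative baseline accuracy: {majority_baseline_accuracy:.2%}")
print(f"toy AUC: {auc(toy_labels, toy_scores)}")  # 7 of 8 pairs ranked correctly: 0.875
```

Because AUC depends only on how bankrupt firms are ranked relative to non-bankrupt ones, it is not inflated by the sheer size of the majority class in the way that accuracy is.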

Source journal
Intelligent Systems in Accounting, Finance and Management (Economics, Econometrics and Finance: Finance)
CiteScore: 6.00
Self-citation rate: 0.00%
Articles published: 0
About the journal: Intelligent Systems in Accounting, Finance and Management is a quarterly international journal that publishes original, high-quality material dealing with all aspects of intelligent systems as they relate to the fields of accounting, economics, finance, marketing and management. The journal is also concerned with related emerging technologies, including big data, business intelligence, social media and other technologies. It encourages the development of novel technologies and the embedding of new and existing technologies into applications of real, practical value. Implementation issues are therefore of as much concern as development issues. The journal is designed to appeal to academics in the intelligent systems, emerging technologies and business fields, as well as to advanced practitioners who wish to improve the effectiveness, efficiency, or economy of their working practices. A special feature of the journal is its use of two groups of reviewers: those who specialize in intelligent systems work and those who specialize in application areas. Reviewers are asked to address issues of originality and actual or potential impact on research, teaching, or practice in the accounting, finance, or management fields. Authors working on conceptual developments or on laboratory-based explorations of data sets therefore need to address the issue of potential impact at some level in submissions to the journal.