Authors: Samar Aly, Marco Alfonse, Abdel-Badeeh M. Salem
Journal: International Journal of Intelligent Computing and Information Sciences, volume 21 1
Publication date: 2022-08-14
DOI: https://doi.org/10.21608/ijicis.2022.105654.1138
Intelligent Model for Enhancing the Bankruptcy Prediction with Imbalanced Data Using Oversampling and CatBoost
Bankruptcy prediction is one of the most significant financial decision-making problems, and accurate prediction helps protect financial institutions from severe risks. Most bankruptcy datasets suffer from an imbalanced distribution between output classes, which can lead to misclassification in the prediction results. This paper presents an efficient bankruptcy prediction model that handles the imbalanced-dataset problem by applying the Synthetic Minority Oversampling Technique (SMOTE) as a pre-processing step. It applies an ensemble-based machine learning classifier, namely Categorical Boosting (CatBoost), to distinguish between the active and inactive classes. Moreover, the proposed model reduces the dimensionality of the dataset by using three different feature selection techniques in order to increase predictive performance. The proposed model is evaluated on the most popular imbalanced bankruptcy dataset, the Polish dataset. The obtained results demonstrate the efficiency of the proposed model, especially in terms of accuracy: on the five Polish yearly datasets, it predicts bankruptcy with accuracies of 98%, 98%, 97%, 97% and 95%, respectively.
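The abstract does not say which SMOTE implementation or parameters were used, so the following is only a minimal sketch of the oversampling idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. The function name `smote`, the default `k=5`, the Euclidean distance, and the toy data are all assumptions made for illustration, not details taken from the paper.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic new points interpolated between minority samples.

    minority: list of equal-length numeric feature vectors (the rare class).
    k: nearest minority neighbours considered per sample (assumed default).
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 8 "active" firms vs. 3 "inactive" firms,
# each described by two made-up financial ratios.
majority = [[0.90, 0.10], [0.80, 0.20], [0.85, 0.15], [0.95, 0.05],
            [0.70, 0.30], [0.75, 0.25], [0.88, 0.12], [0.92, 0.08]]
minority = [[0.10, 0.90], [0.20, 0.80], [0.15, 0.70]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In a pipeline like the one the abstract describes, the balanced training set would then be passed to the classifier. Oversampling is normally applied to the training split only, so that the evaluation data keeps its original class distribution.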