A hybrid machine learning framework by incorporating categorical boosting and manifold learning for financial analysis
Authors: Yuyang Zhao, Hongbo Zhao
Journal: Intelligent Systems with Applications, vol. 25, Article 200473
Published: 2024-12-27
DOI: 10.1016/j.iswa.2024.200473
URL: https://www.sciencedirect.com/science/article/pii/S2667305324001479
Citations: 0
Abstract
Financial analysis is essential for evaluating financial behavior and risk in financial activities. However, it is difficult to carry out because financial features are complex and interact in complicated ways. This study developed a hybrid machine-learning framework that combines categorical boosting (CatBoost) and manifold learning for financial analysis. CatBoost was employed to capture the financial mechanism and to characterize the complex, nonlinear relationship between the financial features and the associated financial behavior. Manifold learning was used to select and extract the critical financial features. The framework was verified and illustrated on synthetic datasets generated from a financial model for loan evaluation. The overall accuracy of the CatBoost model increased from 81.5 % to 99.1 %, and the accuracy for predicting unapproved loans increased from 64 % to 98.88 %. The framework thus substantially improves the prediction accuracy of loan-approval status and characterizes the financial behavior and mechanism well. The hybrid framework also distinguishes between the various financial features and their associated loan-approval status. The analysis further found that credit score and annual income are the two essential features and that the contribution of the other features is almost negligible. It revealed that a credit score of 500 and an annual income of 70,000 are critical thresholds for loan approval, as set by the financial analysis model used to generate the dataset. The results show that the framework can extract the financial features and capture the financial mechanism in financial analysis. It provides a scientific, reasonable, and promising approach to financial analysis and to understanding financial behavior.
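To make the two reported metrics concrete, the following is a minimal Python sketch of a threshold-driven synthetic loan dataset and of the difference between overall accuracy and accuracy on the unapproved class. It is not the authors' data generator or model. Only the two threshold values (500 and 70,000) come from the abstract; the rule that both thresholds must be met, the feature ranges, the `loan_amount` distractor feature, and all function names are assumptions made for illustration. "Accuracy for predicting unapproved loans" is read here as the fraction of truly unapproved loans that are predicted correctly, which is an interpretation. The predictor is a naive credit-score-only baseline, not CatBoost.

```python
import random

# Thresholds quoted in the abstract; how they combine is an assumption.
CREDIT_SCORE_THRESHOLD = 500
ANNUAL_INCOME_THRESHOLD = 70_000


def is_approved(credit_score, annual_income):
    """Hypothetical rule: approve only when BOTH thresholds are met."""
    return int(credit_score >= CREDIT_SCORE_THRESHOLD
               and annual_income >= ANNUAL_INCOME_THRESHOLD)


def make_synthetic_loans(n, seed=0):
    """Draw n applicants with illustrative (assumed) feature ranges."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        credit_score = rng.randint(300, 850)          # assumed range
        annual_income = rng.randint(20_000, 150_000)  # assumed range
        loan_amount = rng.randint(1_000, 50_000)      # distractor, unused by the rule
        rows.append({
            "credit_score": credit_score,
            "annual_income": annual_income,
            "loan_amount": loan_amount,
            "approved": is_approved(credit_score, annual_income),
        })
    return rows


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def class_accuracy(y_true, y_pred, label):
    """Fraction of samples whose true class is `label` that are predicted correctly."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t == label]
    return sum(t == p for t, p in pairs) / len(pairs)


if __name__ == "__main__":
    loans = make_synthetic_loans(1_000)
    y_true = [row["approved"] for row in loans]
    # Naive baseline that looks at credit score only, ignoring income.
    y_pred = [int(row["credit_score"] >= CREDIT_SCORE_THRESHOLD) for row in loans]
    print("overall accuracy:", overall_accuracy(y_true, y_pred))
    print("unapproved-class accuracy:", class_accuracy(y_true, y_pred, label=0))
```

Under the assumed rule, this baseline only ever errs on unapproved applicants (those with a sufficient credit score but insufficient income), so its unapproved-class accuracy cannot exceed its overall accuracy. This illustrates why a per-class figure can sit well below the headline accuracy, without reproducing any of the paper's numbers.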