Financial Risk Assessment Model Based on Stochastic Forest and Decision Tree Hybrid Algorithm
Peipei Pan
2021 IEEE 3rd Eurasia Conference on IOT, Communication and Engineering (ECICE), published 2021-10-29
DOI: 10.1109/ECICE52819.2021.9645735 (https://doi.org/10.1109/ECICE52819.2021.9645735)
Citations: 0
Abstract
The random forest model is an ensemble classifier whose base learners are decision trees built with the CART algorithm. It uses sampling with replacement: each tree's training set is drawn at random from the original data by the Bagging (bootstrap aggregating) method, and the trees are grown without additional pruning. This paper proposes a financial risk assessment model based on the random forest. First, six types of variables are defined according to the characteristics of the samples, and outliers are removed using box plots. The processed samples are then modeled and tested by cross-validation. The model uses 600 decision trees, with the split node set to 4. The resulting model has a precision of 82.27%, a recall of 83.55%, and an F1 value of 0.8196, indicating good prediction accuracy. Compared with KNN and SVM models, the accuracy of the random forest model is 9.56% and 3.36% higher, respectively, which shows that it is clearly superior to KNN and SVM in accuracy when processing high-dimensional big data samples.
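Three steps named in the abstract can be sketched with the Python standard library: box-plot outlier removal, sampling with replacement for one bagging training set, and the F1 value. This is a minimal sketch under stated assumptions, not the paper's implementation. The function names are hypothetical, and the whisker factor of 1.5 is the usual box-plot convention rather than a value given in the abstract.

```python
import random
import statistics


def iqr_filter(values, k=1.5):
    """Drop values outside the box-plot whiskers [Q1 - k*IQR, Q3 + k*IQR].

    k = 1.5 is the conventional whisker length, assumed here; the
    abstract does not state the threshold that was used.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def bootstrap_sample(rows, seed=None):
    """Draw len(rows) items with replacement (one bagging training set)."""
    rng = random.Random(seed)
    return rng.choices(rows, k=len(rows))


def f1_from_precision_recall(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Applied to the reported precision (0.8227) and recall (0.8355), the harmonic-mean formula gives about 0.829, slightly above the reported F1 value of 0.8196. The abstract does not say how the F1 value was averaged, so the difference may come from per-class or per-fold averaging.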