Authors: Bissilimou Racidatou Orounla, Akoeugnigan Idelphonse Sode, Kolawole Valère Salako, Romain Glèlè Kakaï
Journal: African Journal of Applied Statistics, volume 16 1, published 2023-01-01
DOI: https://doi.org/10.16929/ajas/2023.1399.274
Empirical Performance of CART, C5.0 and Random Forest Classification Algorithms for Decision Trees
This study compares the performance of the CART, C5.0 and Random Forest (RF) algorithms. A population of 10,000 observations was simulated with 25 continuous predictors and 25 factors (categorical predictors). From this population, samples were generated by varying the number of predictors, the proportion of categorical versus continuous predictors, and the sample size. The performance of all three tree algorithms increases with sample size and with the number of variables, but the performance of RF is much higher than that of CART and C5.0. Irrespective of the algorithm, performance decreases when there are more categorical variables than continuous variables.
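The sampling design described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not give the predictor distributions, the number of factor levels, or how the response was generated, so the standard normal draws, the three-level factors, and the function names `simulate_population` and `draw_sample` are all hypothetical, and no response variable is simulated.

```python
import random


def simulate_population(n_rows=10_000, n_continuous=25, n_factors=25,
                        n_levels=3, seed=0):
    """Simulate a population as a list of rows (dicts of column -> value).

    Assumptions (not from the source): continuous predictors are standard
    normal, factors are uniform over `n_levels` integer-coded levels.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(n_rows):
        row = {f"x{j + 1}": rng.gauss(0.0, 1.0) for j in range(n_continuous)}
        row.update({f"f{j + 1}": rng.randrange(n_levels)
                    for j in range(n_factors)})
        population.append(row)
    return population


def draw_sample(population, sample_size, n_predictors, prop_categorical,
                seed=0):
    """Draw one sample for a given design cell.

    The three arguments mirror the factors varied in the study: sample size,
    number of predictors, and the share of categorical predictors.
    """
    rng = random.Random(seed)
    n_cat = round(n_predictors * prop_categorical)
    n_con = n_predictors - n_cat
    continuous = [c for c in population[0] if c.startswith("x")]
    factors = [c for c in population[0] if c.startswith("f")]
    # Pick which predictors to keep, then which rows to keep.
    columns = rng.sample(continuous, n_con) + rng.sample(factors, n_cat)
    rows = rng.sample(population, sample_size)
    return [{c: row[c] for c in columns} for row in rows]
```

For example, `draw_sample(simulate_population(), sample_size=500, n_predictors=10, prop_categorical=0.3)` would return 500 rows with 7 continuous and 3 categorical predictors; each such sample could then be passed to whichever CART, C5.0 and RF implementations are being compared.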