Empirical Study for Algorithms Comparison of Classification and Regression Tree and Logistic Regression Using Combined 5×2cv F Test
Authors: Fayza Annisa Febrianti, Dodi Vionanda, Yenni Kurniawati, Fadhilah Fitri
Journal: UNP Journal of Statistics and Data Science
Published: 2023-08-28
DOI: 10.24036/ujsds/vol1-iss4/85 (https://doi.org/10.24036/ujsds/vol1-iss4/85)
Classification is a method for estimating the class of an object from its characteristics. Several learning algorithms can be applied to classification, such as Classification and Regression Tree (CART) and logistic regression. The main goal is to find the learning algorithm that yields the best classifier. When comparing two learning algorithms, directly choosing the one with the smaller prediction error rate may be acceptable when the difference is very clear. When it is not, a direct comparison can be misleading and lead to inadequate conclusions. Therefore, a statistical test is needed to determine whether the difference is real or due to chance. The 5×2cv paired t-test sometimes rejects and sometimes fails to reject the null hypothesis. This inconsistency is problematic because changing which error rate difference is used should not affect the test result. Meanwhile, overall, the combined 5×2cv F tests fail to reject the hypothesis. This indicates that there is no statistically significant difference between the performance of CART and logistic regression in this case.
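The contrast between the two tests can be illustrated with a minimal sketch, assuming the usual definitions of the statistics: for replication i and fold j, p_i^(j) is the difference in error rates between the two algorithms, and s_i^2 is the variance estimate within replication i. The paired t statistic uses a single difference in its numerator, while the combined F statistic uses all ten. The function names and the table of differences below are hypothetical and are not taken from the study; the critical values are the standard 5% table values.

```python
from math import sqrt

# Standard 5% table values (rounded).
T_CRIT_5DF = 2.571   # two-sided critical value of t with 5 degrees of freedom
F_CRIT_10_5 = 4.74   # upper critical value of F with (10, 5) degrees of freedom


def replication_variances(diffs):
    """Return s_i^2 for each replication, given pairs (p_i^(1), p_i^(2))."""
    variances = []
    for p1, p2 in diffs:
        mean = (p1 + p2) / 2
        variances.append((p1 - mean) ** 2 + (p2 - mean) ** 2)
    return variances


def paired_t_5x2cv(diffs, i=0, j=0):
    """5x2cv paired t statistic with p_i^(j) as the numerator.

    The classical test uses the first fold of the first replication
    (i=0, j=0); other choices are exposed only to show the sensitivity.
    """
    s2 = replication_variances(diffs)
    return diffs[i][j] / sqrt(sum(s2) / 5)


def combined_f_5x2cv(diffs):
    """Combined 5x2cv F statistic, which uses all ten differences."""
    s2 = replication_variances(diffs)
    numerator = sum(p ** 2 for pair in diffs for p in pair)
    return numerator / (2 * sum(s2))


# Hypothetical error rate differences (algorithm A minus algorithm B),
# one pair per replication. These numbers are made up for illustration.
example_diffs = [
    (0.05, 0.03),
    (0.02, -0.01),
    (0.00, 0.02),
    (0.01, -0.02),
    (0.02, 0.00),
]

t_first = paired_t_5x2cv(example_diffs, i=0, j=0)
t_second = paired_t_5x2cv(example_diffs, i=0, j=1)
f_stat = combined_f_5x2cv(example_diffs)

print(f"t using p_1^(1): {t_first:.2f}, reject: {abs(t_first) > T_CRIT_5DF}")
print(f"t using p_1^(2): {t_second:.2f}, reject: {abs(t_second) > T_CRIT_5DF}")
print(f"combined F:      {f_stat:.2f}, reject: {f_stat > F_CRIT_10_5}")
```

With these made-up differences, the t statistic exceeds its critical value for one choice of numerator but not for the other, whereas the F statistic (about 1.73) stays below 4.74 regardless, which mirrors the behaviour the abstract describes. In practice the differences would come from five replications of 2-fold cross-validation on the actual data.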