On the Application of a New Method of the Top-Down Decision Tree Incremental Pruning in Data Classification

Shao Hongbo, Zhou Jing, Wu Jianhui

The Open Automation and Control Systems Journal, 2015-10-20. DOI: 10.2174/1874444301507011922
Decision trees, an important branch of machine learning, have been applied successfully in many areas. A key limitation of decision tree learning is over-fitting of the training set, which weakens the accuracy of the resulting trees. To overcome this defect, pruning is often adopted as a follow-up step to the learning algorithm in order to optimize the tree. Commonly used pruning methods are based on statistical analysis of the training samples; when samples are scarce, a small training set is statistically unrepresentative and causes these pruning methods to fail. Building on previous research, this paper presents a top-down decision tree incremental pruning method (TDIP), which applies incremental learning to the comparison between certainty and uncertainty rules so that only the former are retained. In addition, to speed up pruning, a top-down search is defined that avoids repeated traversals of the same decision tree. TDIP is independent of the statistical characteristics of the training set, making it a robust pruning method. Experimental results show that the method maintains a good balance between the accuracy and the size of pruned decision trees, and outperforms traditional methods on classification problems.
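To make the core idea concrete, the sketch below is a minimal illustration (not the authors' TDIP algorithm) of top-down pruning driven by rule certainty: walking the tree from the root, any subtree whose reaching samples all agree on a label corresponds to a certainty rule and is collapsed to a leaf, so each node is visited only once. The `Node` structure, `is_certain` test, and `prune_top_down` function are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    # Leaf: carries a predicted label; internal: a feature index and
    # one child per observed feature value.
    label: Optional[str] = None
    feature: Optional[int] = None
    children: dict = field(default_factory=dict)

    def is_leaf(self) -> bool:
        return self.label is not None and not self.children

def majority_label(samples):
    """Most frequent class label among (features, label) pairs."""
    labels = [y for _, y in samples]
    return max(set(labels), key=labels.count)

def is_certain(samples) -> bool:
    """A path's rule is 'certain' if every sample reaching it agrees."""
    return len({y for _, y in samples}) <= 1

def prune_top_down(node: Node, samples) -> Node:
    """Single top-down pass: collapse any subtree whose samples already
    form a certainty rule, so no subtree is re-examined later."""
    if node.is_leaf():
        return node
    if is_certain(samples):
        # Certainty rule reached: the whole subtree reduces to one leaf.
        return Node(label=majority_label(samples))
    for value, child in list(node.children.items()):
        subset = [(x, y) for x, y in samples if x[node.feature] == value]
        if subset:
            node.children[value] = prune_top_down(child, subset)
    return node
```

For example, if every training sample routed into one branch shares the label `"yes"`, the entire subtree under that branch collapses to a single `"yes"` leaf in one pass, which is the sense in which a top-down search avoids iterating over the same decision tree.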