Hang Yang, Peng Li, Xiaobin Guo, Huajun Chen, Zhiqiang Lin
Venue: 2014 Ninth International Conference on P2P, Parallel, Grid, Cloud and Internet Computing
Publication date: 2014-11-08
DOI: https://doi.org/10.1504/IJICT.2018.10008905
A Review: The Effects of Imperfect Data on Incremental Decision Tree
Decision trees, among the most widely used methods in data mining, have been applied in many real-world settings. Incremental decision trees handle streaming data, which makes them suitable for big data analysis. However, imperfect data are unavoidable in real-world applications. Focusing on state-of-the-art incremental decision tree induction based on the Hoeffding bound, we investigate the influence of imperfect data on the decision tree model. We find that imperfect data degrade decision tree learning, resulting in lower accuracy and higher resource consumption. This paper can serve as a reference for future research: a new generation of incremental decision trees should aim to overcome the negative effects of imperfect data.
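For readers unfamiliar with the Hoeffding bound mentioned above, the following is a minimal sketch of how it is commonly used as a split test in incremental decision tree induction. It is not taken from the paper. The function names, the use of information gain as the split measure, and the sample values are illustrative assumptions. The bound itself is the standard form: epsilon = sqrt(R^2 * ln(1/delta) / (2n)), where R is the range of the measured quantity, delta the allowed error probability, and n the number of examples seen.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of the mean observed
    over n independent samples."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 num_classes: int, delta: float, n: int) -> bool:
    """Hypothetical helper: split a leaf when the observed gain gap between
    the two best attributes exceeds the Hoeffding bound."""
    # Assumption: the split measure is information gain, whose range is
    # log2(number of classes).
    value_range = math.log2(num_classes)
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Illustrative values only: a wide gain gap passes the test early,
    # a narrow gap needs more examples before the bound shrinks enough.
    print(should_split(0.30, 0.15, num_classes=2, delta=1e-7, n=1000))
    print(should_split(0.30, 0.25, num_classes=2, delta=1e-7, n=1000))
    print(should_split(0.30, 0.25, num_classes=2, delta=1e-7, n=10000))
```

Under this test, anything that narrows the observed gap between the two best attributes means more examples must be collected before a leaf can split. That is one plausible mechanism by which imperfect data could raise resource consumption, though the abstract does not spell out the mechanism.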