{"title":"决策树:训练、测试和兼容性缺失值技术回顾","authors":"Sachin Gavankar, S. Sawarkar","doi":"10.1109/AIMS.2015.29","DOIUrl":null,"url":null,"abstract":"Data mining rely on large amount of data to make learning model and the quality of data is very important. One of the important problem under data quality is the presence of missing values. Missing values can occur in both at the time of training and at the time of testing. There are many methods proposed to deal with missing values in training data. Many of them resort to imputation techniques. However, Very few methods are there to deal with the missing values at testing/prediction time. In this paper, we discuss and summarize various strategies to deal with this problem both at training and testing time. Also, we have discussed the compatibility between various methods at training and testing to achieve better results.","PeriodicalId":121874,"journal":{"name":"2015 3rd International Conference on Artificial Intelligence, Modelling and Simulation (AIMS)","volume":"356 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"27","resultStr":"{\"title\":\"Decision Tree: Review of Techniques for Missing Values at Training, Testing and Compatibility\",\"authors\":\"Sachin Gavankar, S. Sawarkar\",\"doi\":\"10.1109/AIMS.2015.29\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Data mining rely on large amount of data to make learning model and the quality of data is very important. One of the important problem under data quality is the presence of missing values. Missing values can occur in both at the time of training and at the time of testing. There are many methods proposed to deal with missing values in training data. Many of them resort to imputation techniques. However, Very few methods are there to deal with the missing values at testing/prediction time. In this paper, we discuss and summarize various strategies to deal with this problem both at training and testing time. Also, we have discussed the compatibility between various methods at training and testing to achieve better results.\",\"PeriodicalId\":121874,\"journal\":{\"name\":\"2015 3rd International Conference on Artificial Intelligence, Modelling and Simulation (AIMS)\",\"volume\":\"356 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"27\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 3rd International Conference on Artificial Intelligence, Modelling and Simulation (AIMS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/AIMS.2015.29\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 3rd International Conference on Artificial Intelligence, Modelling and Simulation (AIMS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AIMS.2015.29","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Decision Tree: Review of Techniques for Missing Values at Training, Testing and Compatibility
Data mining rely on large amount of data to make learning model and the quality of data is very important. One of the important problem under data quality is the presence of missing values. Missing values can occur in both at the time of training and at the time of testing. There are many methods proposed to deal with missing values in training data. Many of them resort to imputation techniques. However, Very few methods are there to deal with the missing values at testing/prediction time. In this paper, we discuss and summarize various strategies to deal with this problem both at training and testing time. Also, we have discussed the compatibility between various methods at training and testing to achieve better results.