决策树:训练、测试和兼容性缺失值技术回顾

Sachin Gavankar, S. Sawarkar
{"title":"决策树:训练、测试和兼容性缺失值技术回顾","authors":"Sachin Gavankar, S. Sawarkar","doi":"10.1109/AIMS.2015.29","DOIUrl":null,"url":null,"abstract":"Data mining rely on large amount of data to make learning model and the quality of data is very important. One of the important problem under data quality is the presence of missing values. Missing values can occur in both at the time of training and at the time of testing. There are many methods proposed to deal with missing values in training data. Many of them resort to imputation techniques. However, Very few methods are there to deal with the missing values at testing/prediction time. In this paper, we discuss and summarize various strategies to deal with this problem both at training and testing time. Also, we have discussed the compatibility between various methods at training and testing to achieve better results.","PeriodicalId":121874,"journal":{"name":"2015 3rd International Conference on Artificial Intelligence, Modelling and Simulation (AIMS)","volume":"356 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"27","resultStr":"{\"title\":\"Decision Tree: Review of Techniques for Missing Values at Training, Testing and Compatibility\",\"authors\":\"Sachin Gavankar, S. Sawarkar\",\"doi\":\"10.1109/AIMS.2015.29\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Data mining rely on large amount of data to make learning model and the quality of data is very important. One of the important problem under data quality is the presence of missing values. Missing values can occur in both at the time of training and at the time of testing. There are many methods proposed to deal with missing values in training data. Many of them resort to imputation techniques. However, Very few methods are there to deal with the missing values at testing/prediction time. In this paper, we discuss and summarize various strategies to deal with this problem both at training and testing time. Also, we have discussed the compatibility between various methods at training and testing to achieve better results.\",\"PeriodicalId\":121874,\"journal\":{\"name\":\"2015 3rd International Conference on Artificial Intelligence, Modelling and Simulation (AIMS)\",\"volume\":\"356 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"27\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 3rd International Conference on Artificial Intelligence, Modelling and Simulation (AIMS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/AIMS.2015.29\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 3rd International Conference on Artificial Intelligence, Modelling and Simulation (AIMS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AIMS.2015.29","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 27

摘要

数据挖掘依赖于大量的数据来建立学习模型,数据的质量非常重要。数据质量的一个重要问题是缺失值的存在。缺失值可能发生在训练时和测试时。针对训练数据中缺失值的处理方法有很多。他们中的许多人求助于归罪于他人的技巧。然而,很少有方法可以在测试/预测时处理缺失值。本文讨论和总结了在训练和测试时处理这一问题的各种策略。此外,我们还讨论了各种训练和测试方法之间的兼容性,以获得更好的结果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Decision Tree: Review of Techniques for Missing Values at Training, Testing and Compatibility
Data mining rely on large amount of data to make learning model and the quality of data is very important. One of the important problem under data quality is the presence of missing values. Missing values can occur in both at the time of training and at the time of testing. There are many methods proposed to deal with missing values in training data. Many of them resort to imputation techniques. However, Very few methods are there to deal with the missing values at testing/prediction time. In this paper, we discuss and summarize various strategies to deal with this problem both at training and testing time. Also, we have discussed the compatibility between various methods at training and testing to achieve better results.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信