Data Mining based Handling Missing Data
A. Dubey, A. Rasool
2019 Third International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), published 2019-12-01
DOI: https://doi.org/10.1109/I-SMAC47947.2019.9032631
Citations: 7
Abstract
Today, more data are generated across applications than ever before. However, most application data are affected by missing values. This issue has received significant attention in statistical research. Obvious examples include repositories related to equipment management, business applications, and surveys. One of the usual ways to handle the issue is to fill in the missing values through imputation. Several imputation techniques have been proposed to date. As datasets grow rapidly in size, modern imputation algorithms are required. In this paper, we provide an extensive overview of current imputation methods, with a special focus on algorithms that exploit the local or global correlation available within the dataset. Furthermore, the paper shows how the imputed predictions can be validated and outlines some possible future directions. This paper is expected to give researchers a good grasp of current trends in this domain and to enable them to create more robust and efficient algorithms.
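To make the distinction between global and local imputation concrete, the following is a minimal Python sketch and not an algorithm taken from the paper. It assumes a small numeric table with `None` marking missing cells. The function names (`mean_impute`, `knn_impute`, `rmse_on_hidden`) and the toy data are hypothetical, chosen only for illustration. Column-mean imputation stands in for a global method, a simple k-nearest-neighbour fill stands in for a local one, and validation is done by hiding cells whose true values are known and scoring the imputed estimates with RMSE.

```python
import math

def mean_impute(rows):
    """Global approach: fill each gap with its column mean over observed values.
    Assumes every column has at least one observed value."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def _distance(a, b):
    """Root mean squared difference over the columns observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_impute(rows, k=2):
    """Local approach: fill each gap with the mean of that column over the
    k nearest rows that have the column observed.
    Assumes at least one such donor row exists for every gap."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, v in enumerate(row):
            if v is not None:
                continue
            donors = sorted(
                (_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [val for _, val in donors[:k]]
            filled[i][j] = sum(nearest) / len(nearest)
    return filled

def rmse_on_hidden(complete, hidden_cells, impute):
    """Hide known cells, impute them, and score the estimates against the truth."""
    masked = [list(r) for r in complete]
    for i, j in hidden_cells:
        masked[i][j] = None
    filled = impute(masked)
    errors = [(filled[i][j] - complete[i][j]) ** 2 for i, j in hidden_cells]
    return math.sqrt(sum(errors) / len(errors))

# Hypothetical toy data: two groups of rows with strongly correlated columns.
complete = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.1],
    [0.9, 1.9, 2.9],
    [10.0, 20.0, 30.0],
    [10.5, 20.5, 30.5],
    [9.5, 19.5, 29.5],
]
hidden = [(0, 2), (4, 1)]  # (row, column) cells to hide for validation

rmse_mean = rmse_on_hidden(complete, hidden, mean_impute)
rmse_knn = rmse_on_hidden(complete, hidden, knn_impute)
print(f"mean imputation RMSE: {rmse_mean:.3f}")
print(f"kNN imputation RMSE:  {rmse_knn:.3f}")
```

On this toy table the neighbour-based fill gives a much lower error than the column mean, because the rows form two well-separated groups and a single global mean falls between them. Real datasets may behave differently, so hiding known cells and scoring the result is a reasonable way to compare methods before choosing one.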