Efficient Imputation Method for Missing Data Focusing on Local Space Formed by Hyper-Rectangle Descriptors
Do Gyun Kim, J. Choi
International Conference on Operations Research and Enterprise Systems, published 2019-02-19
DOI: 10.5220/0007582104670472 (https://doi.org/10.5220/0007582104670472)
Citation count: 0
Abstract
Real-world data sets often contain missing values for a variety of reasons. These missing values must be handled because most data analysis methods assume that the data set is complete. Deleting incomplete records is a simple alternative, but it is unsuitable for data sets with many missing values, and the remaining data may lack representativeness. Furthermore, existing imputation methods usually ignore the local space around missing values, which may influence the quality of the imputed values. Based on these observations, we propose an imputation method using the Hyper-Rectangle Descriptor (HRD), which can focus on the local space around missing values. We describe how imputation can be carried out using HRD, in a method named HRD_imput, and validate its performance in a numerical experiment by comparing it with imputation results obtained without HRD. Finally, we outline some ideas for further development of this work.
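The abstract does not say how the hyper-rectangles are constructed, so the following is only a minimal sketch of the general contrast it draws between local and global imputation, not the authors' method. It assumes a fixed-size axis-aligned box centered on the observed values of the incomplete record. The names `impute_local` and `half_width` and the toy data are hypothetical.

```python
from statistics import mean


def in_box(row, lower, upper, dims):
    """Return True if `row` lies inside the axis-aligned box on the given dims."""
    return all(lower[d] <= row[d] <= upper[d] for d in dims)


def impute_local(complete_rows, record, half_width):
    """Fill None entries of `record` using only nearby complete rows.

    Assumption (not from the paper): the local space is a box of fixed
    half-width around the record's observed values. If no complete row
    falls inside the box, fall back to the mean over all complete rows.
    """
    observed = [d for d, v in enumerate(record) if v is not None]
    missing = [d for d, v in enumerate(record) if v is None]

    # Box bounds are defined only on the dimensions that were observed.
    lower = {d: record[d] - half_width for d in observed}
    upper = {d: record[d] + half_width for d in observed}

    local = [r for r in complete_rows if in_box(r, lower, upper, observed)]
    pool = local if local else complete_rows

    filled = list(record)
    for d in missing:
        filled[d] = mean(r[d] for r in pool)
    return filled


# Toy data: two well-separated groups of complete records.
complete = [
    (1.0, 1.0, 10.0),
    (1.2, 0.8, 12.0),
    (0.9, 1.1, 11.0),
    (5.0, 5.0, 50.0),
    (5.2, 4.9, 52.0),
]
record = (1.1, 1.0, None)

local_fill = impute_local(complete, record, half_width=0.5)
global_fill = mean(r[2] for r in complete)
```

In this toy case the box contains only the three records near the incomplete one, so the local estimate stays within that group's range, while the global mean is pulled toward the distant group. This is the kind of effect the abstract attributes to ignoring the local space.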