kNN Imputation Versus Mean Imputation for Handling Missing Data on Vulnerability Index in Dealing with Covid-19 in Indonesia
Heru Nugroho, N. P. Utama, K. Surendro
Proceedings of the 2023 12th International Conference on Software and Computer Applications, published 2023-02-23
DOI: 10.1145/3587828.3587832 (https://doi.org/10.1145/3587828.3587832)
Citations: 0
Abstract
COVID-19 spread rapidly throughout the world, and the WHO declared it a pandemic on March 11, 2020. Previous research considered five domains associated with the social vulnerability index in the context of managing and mitigating pandemic infection in the community: socioeconomic conditions, demographic composition, housing and hygiene, availability of health care facilities, and epidemiological factors related to COVID-19. The Katadata Insight Center (KIC) assesses the vulnerability of Indonesian provinces to the coronavirus using an index based on the risks of regional characteristics, population health, and mobility. The supporting data may be incomplete or missing, a common flaw that distorts a prediction system's results and can render it ineffective. This paper compares kNN-based imputation with mean imputation for handling missing data, which otherwise causes the provincial vulnerability index in Indonesia to be measured incorrectly. The vulnerability index associated with COVID-19 should be one of the factors the Indonesian government considers when making decisions or establishing lockdown strategies and large-scale restriction rules in each province. When missing data is discovered, either kNN imputation or mean imputation can be used to fill it in. In the experiments on the dataset of the vulnerability index for dealing with COVID-19 in Indonesia, mean imputation has a much lower average RMSE than kNN imputation.
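The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data: the toy table, the masked cells, the choice of k = 2, and the distance rule (Euclidean distance over the columns both rows observe, rescaled by the share of columns compared) are all assumptions made for illustration. The sketch hides known values, fills them with each method, and scores each method by RMSE over the hidden cells only.

```python
import math

def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column.
    Assumes every column has at least one observed value."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def _distance(a, b):
    """Euclidean distance over columns observed in both rows, rescaled so that
    rows compared on fewer columns are not favoured (an illustrative choice)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))

def knn_impute(rows, k=2):
    """Replace each None with the mean of that column over the k nearest rows
    that observe it. Assumes at least one such donor row exists."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, v in enumerate(row):
            if v is not None:
                continue
            donors = [(_distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            # Stable sort: equal distances keep their original row order.
            ranked = sorted(donors, key=lambda d: d[0])
            nearest = [value for _, value in ranked[:k]]
            filled[i][j] = sum(nearest) / len(nearest)
    return filled

def rmse(truth, imputed, cells):
    """Root mean squared error over the artificially hidden cells only."""
    squared = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in cells]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical complete table (made-up numbers, not the vulnerability index data).
complete = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 6.0, 9.0],
    [4.0, 8.0, 12.0],
    [5.0, 10.0, 15.0],
]
masked = [(1, 2), (3, 1)]  # cells hidden to simulate missing data

observed = [list(r) for r in complete]
for i, j in masked:
    observed[i][j] = None

mean_filled = mean_impute(observed)
knn_filled = knn_impute(observed, k=2)

print("mean imputation RMSE:", rmse(complete, mean_filled, masked))
print("kNN imputation RMSE: ", rmse(complete, knn_filled, masked))
```

Which method scores lower depends on the data; the result on this toy table says nothing about the result reported for the Indonesian vulnerability index dataset. A lower RMSE means the imputed values are closer, on average, to the hidden true values.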