A Fast and Effective Classification Method for Missing Data
Y. Liu, Chaoya Wang, Wenxin Sun
2022 International Symposium on Advances in Informatics, Electronics and Education (ISAIEE), 2022-12-01
DOI: 10.1109/ISAIEE57420.2022.00052
Citation count: 0
Abstract
Missing data are common in real-world applications, and handling them is a prerequisite for pattern classification. It is therefore necessary to use the existing reliable training data to impute the missing values, and the choice of imputation method has a significant impact on how well ambiguity in the data set is handled. This paper presents a fast and effective method for classifying data with missing values. Specifically, we propose two strategies for estimating incomplete data: nearest class-center imputation (NCCI) and weighted class-center imputation (WCCI). To further reduce the influence of noise in the training set, we also propose a method for optimizing the training set. Finally, a conventional classifier is used to classify the imputed data. The effectiveness of the proposed method is verified by comparing it with related methods on several datasets.
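The abstract names the two imputation strategies but does not define them, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that a class center is the per-class feature mean of complete training rows, that distance is Euclidean over the observed features only, that NCCI copies the missing values from the nearest center, and that WCCI blends all centers with inverse-distance weights. The function names (`class_centers`, `ncci`, `wcci`) and the `eps` parameter are hypothetical; the training-set optimization step and the final classifier are omitted.

```python
import numpy as np

def class_centers(X_train, y_train):
    """Assumed definition: a class center is the mean of that class's rows."""
    y_train = np.asarray(y_train)
    labels = np.unique(y_train)
    centers = np.stack([X_train[y_train == c].mean(axis=0) for c in labels])
    return labels, centers

def _distances(x, centers):
    """Euclidean distance from x to each center over observed features only."""
    observed = ~np.isnan(x)
    diff = centers[:, observed] - x[observed]
    return np.sqrt((diff ** 2).sum(axis=1))

def ncci(x, centers):
    """Assumed NCCI: fill missing entries from the single nearest class center."""
    nearest = centers[np.argmin(_distances(x, centers))]
    missing = np.isnan(x)
    filled = x.copy()
    filled[missing] = nearest[missing]
    return filled

def wcci(x, centers, eps=1e-12):
    """Assumed WCCI: fill missing entries from an inverse-distance-weighted
    average of all class centers (eps avoids division by zero)."""
    weights = 1.0 / (_distances(x, centers) + eps)
    weights /= weights.sum()
    blended = weights @ centers
    missing = np.isnan(x)
    filled = x.copy()
    filled[missing] = blended[missing]
    return filled

# Toy data invented for illustration: two well-separated classes.
X_train = np.array([[0.0, 0.0, 0.0],
                    [0.0, 2.0, 2.0],
                    [10.0, 10.0, 10.0],
                    [10.0, 12.0, 12.0]])
y_train = np.array([0, 0, 1, 1])
labels, centers = class_centers(X_train, y_train)

x = np.array([1.0, np.nan, 1.0])  # second feature is missing
print("NCCI:", ncci(x, centers))
print("WCCI:", wcci(x, centers))
```

Under these assumptions, NCCI commits to a single class when filling a value, whereas WCCI hedges across classes in proportion to how close the sample is to each center. The imputed rows could then be passed to any conventional classifier, as the abstract describes.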