Prototype Selection for k-Nearest Neighbors Classification Using Geometric Median
Chatchai Kasemtaweechok, W. Suwannik
International Conference on Network, Communication and Computing, 2016-12-17
DOI: 10.1145/3033288.3033301 (https://doi.org/10.1145/3033288.3033301)
Citations: 3
Abstract
The k-Nearest Neighbors (kNN) classifier is a well-known classifier used extensively in data mining research. It suffers from several drawbacks, including high storage requirements, high computational cost, and high sensitivity to noise. Prototype selection (PS) is a promising solution to these problems because it reduces the number of data instances. This study proposes the Geometric Median Prototype Selection (GMPS) algorithm, a new and efficient prototype selection method based on the Geometric Median (GM). A set of GMs is selected as the relevant prototypes of the dataset, and the selected prototypes form the training set for a kNN classifier. The classifier is then evaluated on a test set. Performance is measured in terms of accuracy, Cohen's kappa, and processing time, and is compared with seven state-of-the-art methods on nine standard datasets. The results show that the GMPS methods achieve better accuracy and kappa than all the PS methods considered, while running at least 3.5 times faster than the other PS methods and 5.5 times faster than the 1NN baseline model. However, the proposed classifiers trail the baseline classifier by about 2 percent in accuracy and 0.05 in Cohen's kappa.
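The abstract does not say how the GMs are computed or how many are kept, so the following is only a minimal sketch of the general idea, not the authors' GMPS procedure. It assumes one prototype per class: the geometric median of each class is approximated with Weiszfeld's iteration, and the training instance closest to that median is kept as the prototype. The function names `geometric_median`, `select_prototypes`, and `predict_1nn` are hypothetical, and skipping points that coincide with the current estimate is a common simplification rather than a rigorous treatment.

```python
import math


def geometric_median(points, iters=100, eps=1e-9):
    """Approximate the geometric median with Weiszfeld's iteration."""
    dim = len(points[0])
    # Start from the centroid of the points.
    estimate = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    for _ in range(iters):
        weighted_sum = [0.0] * dim
        weight_total = 0.0
        for p in points:
            dist = math.dist(p, estimate)
            if dist < eps:
                # Simplification: skip a point that coincides with the
                # estimate to avoid dividing by zero.
                continue
            weight = 1.0 / dist
            weight_total += weight
            for d in range(dim):
                weighted_sum[d] += weight * p[d]
        if weight_total == 0.0:
            break
        updated = [v / weight_total for v in weighted_sum]
        converged = math.dist(updated, estimate) < eps
        estimate = updated
        if converged:
            break
    return estimate


def select_prototypes(X, y):
    """Assumed scheme: keep, per class, the instance nearest to the class GM."""
    by_class = {}
    for x, label in zip(X, y):
        by_class.setdefault(label, []).append(x)
    prototypes = []
    for label, pts in by_class.items():
        gm = geometric_median(pts)
        nearest = min(pts, key=lambda p: math.dist(p, gm))
        prototypes.append((nearest, label))
    return prototypes


def predict_1nn(prototypes, x):
    """Classify x with the label of its nearest prototype (1NN)."""
    return min(prototypes, key=lambda pl: math.dist(pl[0], x))[1]


# Toy data: two well-separated classes in the plane.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
     (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
y = ["a", "a", "a", "b", "b", "b"]
protos = select_prototypes(X, y)
print(predict_1nn(protos, (0.5, 0.5)))
```

With this scheme the reduced training set holds one instance per class, so each prediction costs a handful of distance computations instead of one per training instance. That reduction is the kind of saving prototype selection aims for, at the price of some accuracy relative to the full 1NN baseline.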