An improvement to k-nearest neighbor classifier

P. Viswanath, T. Hitendra Sarma
{"title":"An improvement to k-nearest neighbor classifier","authors":"P. Viswanath, T. Hitendra Sarma","doi":"10.1109/RAICS.2011.6069307","DOIUrl":null,"url":null,"abstract":"Non-parametric methods like Nearest neighbor classifier (NNC) and its variants such as k-nearest neighbor classifier (k-NNC) are simple to use and often shows good performance in practice. It stores all training patterns and searches to find k nearest neighbors of the given test pattern. Some fundamental improvements to k-NNC are (i) weighted k-nearest neighbor classifier (wk-NNC) where a weight to each of the neighbors is given and is used in the classification, (ii) to use a bootstrapped training set instead of the given training set, etc. Hamamoto et. al. [1] has given a bootstrapping method, where a training pattern is replaced by a weighted mean of a few of its neighbors from its own class of training patterns. It is shown to improve the classification accuracy in most of the cases. The time to create the bootstrapped set is O(n2) where n is the number of training patterns. This paper presents a novel improvement to the k-NNC called k-Nearest Neighbor Mean Classifier (k-NNMC). k-NNMC finds k nearest neighbors for each class of training patterns separately, and finds means for each of these k neighbors (class-wise). Classification is done according to the nearest mean pattern. It is shown experimentally using several standard data-sets that the proposed classifier shows better classification accuracy over k-NNC, wk-NNC and k-NNC using Hamamoto's bootstrapped training set. Further, the proposed method does not have a design phase as the Hamamoto's method, and this is suitable for parallel implementations which can be coupled with any indexing and space reduction methods easily. It is a suitable method to be used in data mining applications.","PeriodicalId":394515,"journal":{"name":"2011 IEEE Recent Advances in Intelligent Computational Systems","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE Recent Advances in Intelligent Computational Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/RAICS.2011.6069307","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Non-parametric methods like the nearest neighbor classifier (NNC) and its variants, such as the k-nearest neighbor classifier (k-NNC), are simple to use and often show good performance in practice. These classifiers store all training patterns and search them to find the k nearest neighbors of a given test pattern. Some fundamental improvements to k-NNC are (i) the weighted k-nearest neighbor classifier (wk-NNC), where each neighbor is given a weight that is used in the classification, and (ii) using a bootstrapped training set instead of the given training set. Hamamoto et al. [1] gave a bootstrapping method in which a training pattern is replaced by a weighted mean of a few of its neighbors from its own class of training patterns; it is shown to improve classification accuracy in most cases. The time to create the bootstrapped set is O(n²), where n is the number of training patterns. This paper presents a novel improvement to k-NNC called the k-nearest neighbor mean classifier (k-NNMC). k-NNMC finds the k nearest neighbors of the test pattern within each class of training patterns separately and computes the mean of each class-wise set of k neighbors; classification is done according to the nearest mean pattern. It is shown experimentally on several standard data sets that the proposed classifier achieves better classification accuracy than k-NNC, wk-NNC, and k-NNC using Hamamoto's bootstrapped training set. Further, unlike Hamamoto's method, the proposed method has no design phase; it is suitable for parallel implementation and can easily be coupled with any indexing and space-reduction methods. It is a suitable method for data mining applications.
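As a minimal sketch of the classification rule described in the abstract, the following Python function implements the k-NNMC decision: for each class, take the k training patterns nearest to the test pattern, average them, and predict the class whose mean is closest. Euclidean distance is assumed, and the function and parameter names are illustrative rather than taken from the paper.

```python
import numpy as np

def knnmc_predict(X_train, y_train, x_test, k=3):
    """Sketch of the k-nearest neighbor mean classifier (k-NNMC).

    For each class: find the k nearest training patterns to x_test,
    compute their mean, and predict the class whose mean is nearest.
    """
    best_class, best_dist = None, np.inf
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]                # patterns of class c only
        d = np.linalg.norm(Xc - x_test, axis=1)   # distances to the test pattern
        kc = min(k, len(Xc))                      # guard against small classes
        nearest = Xc[np.argsort(d)[:kc]]          # k nearest neighbors in class c
        mean_c = nearest.mean(axis=0)             # class-wise neighbor mean
        dist_c = np.linalg.norm(mean_c - x_test)  # distance to that mean
        if dist_c < best_dist:                    # keep the nearest mean pattern
            best_class, best_dist = c, dist_c
    return best_class

# Hypothetical usage on a toy two-class problem:
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [1.1, 0.9]])
y = np.array([0, 0, 1, 1])
print(knnmc_predict(X, y, np.array([0.15, 0.05]), k=2))  # -> 0
```

Note that, consistent with the abstract's claim of no design phase, all work happens at query time: each class is processed independently, which is what makes the method amenable to parallel implementation and to pairing with indexing or space-reduction schemes.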