An improvement to k-nearest neighbor classifier

P. Viswanath, T. Hitendra Sarma
{"title":"An improvement to k-nearest neighbor classifier","authors":"P. Viswanath, T. Hitendra Sarma","doi":"10.1109/RAICS.2011.6069307","DOIUrl":null,"url":null,"abstract":"Non-parametric methods like Nearest neighbor classifier (NNC) and its variants such as k-nearest neighbor classifier (k-NNC) are simple to use and often shows good performance in practice. It stores all training patterns and searches to find k nearest neighbors of the given test pattern. Some fundamental improvements to k-NNC are (i) weighted k-nearest neighbor classifier (wk-NNC) where a weight to each of the neighbors is given and is used in the classification, (ii) to use a bootstrapped training set instead of the given training set, etc. Hamamoto et. al. [1] has given a bootstrapping method, where a training pattern is replaced by a weighted mean of a few of its neighbors from its own class of training patterns. It is shown to improve the classification accuracy in most of the cases. The time to create the bootstrapped set is O(n2) where n is the number of training patterns. This paper presents a novel improvement to the k-NNC called k-Nearest Neighbor Mean Classifier (k-NNMC). k-NNMC finds k nearest neighbors for each class of training patterns separately, and finds means for each of these k neighbors (class-wise). Classification is done according to the nearest mean pattern. It is shown experimentally using several standard data-sets that the proposed classifier shows better classification accuracy over k-NNC, wk-NNC and k-NNC using Hamamoto's bootstrapped training set. Further, the proposed method does not have a design phase as the Hamamoto's method, and this is suitable for parallel implementations which can be coupled with any indexing and space reduction methods easily. It is a suitable method to be used in data mining applications.","PeriodicalId":394515,"journal":{"name":"2011 IEEE Recent Advances in Intelligent Computational Systems","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE Recent Advances in Intelligent Computational Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/RAICS.2011.6069307","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Non-parametric methods like the nearest neighbor classifier (NNC) and its variants, such as the k-nearest neighbor classifier (k-NNC), are simple to use and often show good performance in practice. These classifiers store all training patterns and search them to find the k nearest neighbors of a given test pattern. Some fundamental improvements to k-NNC are (i) the weighted k-nearest neighbor classifier (wk-NNC), where each neighbor is given a weight that is used in the classification, and (ii) using a bootstrapped training set instead of the given training set. Hamamoto et al. [1] gave a bootstrapping method in which a training pattern is replaced by a weighted mean of a few of its neighbors from its own class of training patterns; it is shown to improve classification accuracy in most cases. The time to create the bootstrapped set is O(n²), where n is the number of training patterns. This paper presents a novel improvement to k-NNC called the k-nearest neighbor mean classifier (k-NNMC). k-NNMC finds the k nearest neighbors of the test pattern within each class of training patterns separately and computes the mean of each class-wise set of k neighbors; classification is done according to the nearest mean pattern. It is shown experimentally on several standard data sets that the proposed classifier achieves better classification accuracy than k-NNC, wk-NNC, and k-NNC using Hamamoto's bootstrapped training set. Further, unlike Hamamoto's method, the proposed method has no design phase; it is suitable for parallel implementation and can easily be coupled with any indexing and space-reduction methods. It is a suitable method for data mining applications.
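As a minimal sketch of the classification rule described in the abstract, the following Python function implements the k-NNMC decision: for each class, take the k training patterns nearest to the test pattern, average them, and predict the class whose mean is closest. Euclidean distance is assumed, and the function and parameter names are illustrative rather than taken from the paper.

```python
import numpy as np

def knnmc_predict(X_train, y_train, x_test, k=3):
    """Sketch of the k-nearest neighbor mean classifier (k-NNMC).

    For each class: find the k nearest training patterns to x_test,
    compute their mean, and predict the class whose mean is nearest.
    """
    best_class, best_dist = None, np.inf
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]                # patterns of class c only
        d = np.linalg.norm(Xc - x_test, axis=1)   # distances to the test pattern
        kc = min(k, len(Xc))                      # guard against small classes
        nearest = Xc[np.argsort(d)[:kc]]          # k nearest neighbors in class c
        mean_c = nearest.mean(axis=0)             # class-wise neighbor mean
        dist_c = np.linalg.norm(mean_c - x_test)  # distance to that mean
        if dist_c < best_dist:                    # keep the nearest mean pattern
            best_class, best_dist = c, dist_c
    return best_class

# Hypothetical usage on a toy two-class problem:
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [1.1, 0.9]])
y = np.array([0, 0, 1, 1])
print(knnmc_predict(X, y, np.array([0.15, 0.05]), k=2))  # -> 0
```

Note that, consistent with the abstract's claim of no design phase, all work happens at query time: each class is processed independently, which is what makes the method amenable to parallel implementation and to pairing with indexing or space-reduction schemes.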