2-Stage instance selection algorithm for KNN based on Nearest Unlike Neighbors

Chunru Dong, P. Chan, Wing W. Y. Ng, D. Yeung
{"title":"2-Stage instance selection algorithm for KNN based on Nearest Unlike Neighbors","authors":"Chunru Dong, P. Chan, Wing W. Y. Ng, D. Yeung","doi":"10.1109/ICMLC.2010.5581078","DOIUrl":null,"url":null,"abstract":"For the virtues such as simplicity, high generalization capability, and few training cost, the K-Nearest-Neighbor (KNN) classifier is widely used in pattern recognition and machine learning. However, the computation complexity of KNN classifier will become higher when dealing with large data sets classification problem. In consequence, its efficiency will be decreased greatly. This paper proposes a general two-stage training set condensing algorithm for general KNN classifier. First, we identify the noise data points and remove them from the original training set. Second, a general condensed nearest neighbor rule based on the so-called Nearest Unlike Neighbor (NUN) is presented to further eliminate the redundant samples in training set. In order to verify the performance of the proposed method, some numerical experiments are conducted on several UCI benchmark databases.","PeriodicalId":126080,"journal":{"name":"2010 International Conference on Machine Learning and Cybernetics","volume":"40 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 International Conference on Machine Learning and Cybernetics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLC.2010.5581078","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Owing to its simplicity, strong generalization capability, and low training cost, the K-Nearest-Neighbor (KNN) classifier is widely used in pattern recognition and machine learning. However, the computational cost of the KNN classifier grows quickly on large-scale classification problems, which greatly reduces its efficiency. This paper proposes a general two-stage training-set condensing algorithm for the KNN classifier. First, noisy data points are identified and removed from the original training set. Second, a generalized condensed nearest neighbor rule based on the so-called Nearest Unlike Neighbor (NUN) is presented to further eliminate redundant samples from the training set. To verify the performance of the proposed method, numerical experiments are conducted on several UCI benchmark databases.
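The abstract names the two stages but not their exact criteria, so the sketch below illustrates one plausible instantiation in Python: Wilson-style editing for the noise-removal stage, and a condensing stage that retains only samples acting as the Nearest Unlike Neighbor (the closest differently labelled sample) of some other point. The function names, the choice of Wilson editing, and the NUN retention rule are assumptions for illustration, not the paper's exact method.

import numpy as np

def edit_noise(X, y, k=3):
    # Stage 1 (assumed criterion): Wilson-style editing -- drop any point
    # misclassified by a majority vote of its k nearest neighbors among
    # the remaining training points.
    n = len(X)
    keep = np.ones(n, dtype=bool)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)  # a point is never its own neighbor
    for i in range(n):
        nbrs = np.argsort(D[i])[:k]
        labels, counts = np.unique(y[nbrs], return_counts=True)
        if labels[np.argmax(counts)] != y[i]:
            keep[i] = False
    return X[keep], y[keep]

def condense_nun(X, y):
    # Stage 2 (assumed reading of the NUN rule): keep a sample only if it
    # is the Nearest Unlike Neighbor of at least one other sample, i.e. it
    # lies on the class boundary; interior points are treated as redundant.
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    is_nun = np.zeros(n, dtype=bool)
    for i in range(n):
        unlike = np.where(y != y[i])[0]
        if unlike.size:
            is_nun[unlike[np.argmin(D[i, unlike])]] = True
    return X[is_nun], y[is_nun]

# Toy usage: two Gaussian blobs with a few flipped labels as noise.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
y[rng.choice(100, 5, replace=False)] ^= 1  # inject label noise
X1, y1 = edit_noise(X, y, k=3)
X2, y2 = condense_nun(X1, y1)
print(len(X), "->", len(X1), "->", len(X2))

On data like this, stage 1 strips the flipped labels and stage 2 keeps only the boundary region between the blobs, so the condensed set is a small fraction of the original while a KNN classifier trained on it behaves similarly near the decision boundary.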