Diminishing Prototype Size for k-Nearest Neighbors Classification

2015 Fourteenth Mexican International Conference on Artificial Intelligence (MICAI). Pub Date: 2015-10-25. DOI: 10.1109/MICAI.2015.27

M. Samadpour, H. Parvin, F. Rad


Citations: 1

Abstract

This paper proposes a new classification method based on the lazy k-Nearest Neighbor (kNN) classifier. The method uses clustering to reduce the size of the kNN training set and thereby lower its time complexity. The new approach is called Modified Nearest Neighbor Classifier Based on Clustering (MNNCBC). As in the traditional lazy kNN algorithm, the main idea is to classify a test instance according to the labels of its nearest neighbors. In MNNCBC, the training set is first grouped into a small number of partitions. By running a simple clustering algorithm several times, MNNCBC obtains a number of such partitions and extracts a large number of clusters from them. A label is then assigned to the center of each cluster by majority vote over the class labels of the patterns in that cluster. MNNCBC iteratively inserts a cluster into a pool of selected clusters, which serves as the training set of the final 1-NN classifier, for as long as the accuracy of the 1-NN classifier improves over a set of patterns that includes the training set and the validation set. The selected clusters form the training set of the proposed 1-NN classifier, and the class label of a new test sample is the class label of the nearest cluster center. While the lazy kNN classifier is computationally expensive, MNNCBC reduces its computational complexity by a factor of 1/k, so MNNCBC is about k times faster than kNN. MNNCBC is evaluated on several real datasets from the UCI repository. Empirical results show that MNNCBC improves markedly on the kNN classifier in both accuracy and time complexity.
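The steps described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not name the clustering algorithm, the number of runs, the number of clusters per run, or the order in which clusters are tried, so the sketch assumes plain k-means, five runs of three clusters each, and a greedy best-first insertion. All function names (`build_prototypes`, `select_prototypes`, `predict`) and the toy two-blob data are hypothetical, and for brevity the training data is reused as the evaluation set.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(pts) for col in zip(*pts))


def kmeans(points, n_clusters, rng, n_iter=10):
    """Plain k-means (an assumed choice); returns index lists of non-empty clusters."""
    centers = rng.sample(points, n_clusters)
    groups = []
    for _ in range(n_iter):
        groups = [[] for _ in range(n_clusters)]
        for i, p in enumerate(points):
            nearest = min(range(n_clusters), key=lambda c: sq_dist(p, centers[c]))
            groups[nearest].append(i)
        for c, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous center
                centers[c] = mean_point([points[i] for i in members])
    return [g for g in groups if g]


def build_prototypes(X, y, n_runs=5, n_clusters=3, seed=0):
    """Pool the clusters of several runs; label each center by majority vote."""
    rng = random.Random(seed)
    prototypes = []
    for _ in range(n_runs):
        for members in kmeans(X, n_clusters, rng):
            center = mean_point([X[i] for i in members])
            label = Counter(y[i] for i in members).most_common(1)[0][0]
            prototypes.append((center, label))
    return prototypes


def predict(prototypes, x):
    """1-NN rule: return the label of the nearest cluster center."""
    return min(prototypes, key=lambda p: sq_dist(x, p[0]))[1]


def accuracy(prototypes, X, y):
    """Fraction of (X, y) that the 1-NN rule over the prototypes gets right."""
    return sum(predict(prototypes, x) == t for x, t in zip(X, y)) / len(X)


def select_prototypes(candidates, X_eval, y_eval):
    """Greedy insertion (assumed order): add clusters while accuracy improves."""
    selected, best = [], 0.0
    remaining = list(candidates)
    while remaining:
        scores = [accuracy(selected + [c], X_eval, y_eval) for c in remaining]
        best_idx = max(range(len(remaining)), key=scores.__getitem__)
        if scores[best_idx] <= best:
            break  # no remaining cluster improves accuracy, so stop
        best = scores[best_idx]
        selected.append(remaining.pop(best_idx))
    return selected


# Toy data (hypothetical): two well-separated 2-D classes of 30 points each.
data_rng = random.Random(1)
X = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(30)]
X += [(data_rng.gauss(10, 0.5), data_rng.gauss(10, 0.5)) for _ in range(30)]
y = [0] * 30 + [1] * 30

candidates = build_prototypes(X, y)
selected = select_prototypes(candidates, X, y)
print(len(candidates), "candidate clusters,", len(selected), "selected")
print("accuracy on the evaluation set:", accuracy(selected, X, y))
```

At prediction time each test sample is compared only against the selected cluster centers rather than against every training pattern, which is where the speed-up claimed in the abstract would come from.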

