Improved Search of Relevant Points for Nearest-Neighbor Classification

Alejandro Flores-Velazco
DOI: 10.48550/arXiv.2203.03567 (https://doi.org/10.48550/arXiv.2203.03567)
Published: 2022-03-07
Journal (as listed): Embedded Systems and Applications

Abstract

Given a training set $P \subset \mathbb{R}^d$, the nearest-neighbor classifier assigns any query point $q \in \mathbb{R}^d$ to the class of its closest point in $P$. Not all training points matter equally for answering these classification queries. We say a training point is relevant if omitting it from the training set could cause some query point in $\mathbb{R}^d$ to be misclassified. Relevant points are commonly known as border points, as they define the boundaries of the Voronoi diagram of $P$ that separate points of different classes. Computing this set efficiently is crucial for reducing the size of the training set without affecting the accuracy of the nearest-neighbor classifier. Improving on a decades-old result by Clarkson, a recent paper by Eppstein proposed an output-sensitive algorithm that finds the set of border points of $P$ in $O(n^2 + nk^2)$ time, where $n = |P|$ and $k$ is the number of border points. In this paper, we improve this algorithm to run in $O(nk^2)$ time by proving that the first steps of Eppstein's algorithm, which require $O(n^2)$ time, are unnecessary.
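
To make the notion of a border point concrete, the following is a minimal sketch restricted to the one-dimensional case, where Voronoi cells are intervals and a point is a border point exactly when an adjacent point in sorted order has a different class. It is not the algorithm of this paper, nor Eppstein's or Clarkson's; the function names `nn_label` and `border_points_1d` and the sample data are hypothetical, and the sketch assumes distinct coordinates and at least two classes.

```python
def nn_label(points, q):
    """Brute-force nearest-neighbor classification of query q.

    points: list of (coordinate, label) pairs, sorted by coordinate.
    Ties are broken in favor of the point that appears first in the list.
    """
    return min(points, key=lambda p: abs(p[0] - q))[1]


def border_points_1d(points):
    """Return the border points of a labeled 1D training set.

    In one dimension, two Voronoi cells are adjacent only when their
    points are consecutive in sorted order, so a point is a border point
    if and only if its left or right neighbor has a different label.
    Assumes distinct coordinates (hypothetical simplification).
    """
    pts = sorted(points)
    border = []
    for i, (x, label) in enumerate(pts):
        left_differs = i > 0 and pts[i - 1][1] != label
        right_differs = i + 1 < len(pts) and pts[i + 1][1] != label
        if left_differs or right_differs:
            border.append((x, label))
    return border


# Illustrative data only: n = 6 training points, two classes.
training = [(0.0, "a"), (1.0, "a"), (2.0, "a"),
            (5.0, "b"), (6.0, "b"), (9.0, "a")]
border = border_points_1d(training)

# Compare the full and reduced training sets on a grid of queries.
queries = [i / 4 for i in range(-12, 49)]  # -3.0 to 12.0 in steps of 0.25
same = all(nn_label(training, q) == nn_label(border, q) for q in queries)
```

Here the two interior points of the first run of class "a" are dropped, leaving $k = 4$ border points, and `same` checks that every query on the grid receives the same class from the reduced set as from the full set. In higher dimensions adjacency of Voronoi cells is no longer given by a sorted order, which is why output-sensitive algorithms such as the ones discussed above are needed.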