Outlier detection via localized p-value estimation

Manqi Zhao, Venkatesh Saligrama
Published in: 2009 47th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
Publication date: 2009-09-30
DOI: 10.1109/ALLERTON.2009.5394501 (https://doi.org/10.1109/ALLERTON.2009.5394501)
Citations: 4

Abstract

We propose LPE, a novel non-parametric adaptive outlier detection algorithm for high-dimensional data, based on score functions derived from nearest neighbor graphs on n-point nominal data. A test sample is declared an outlier whenever its score falls below α, the desired false alarm level. The resulting detector is shown to be asymptotically optimal: it is uniformly most powerful at the specified false alarm level α when the outlier density is a mixture of the nominal density and a known density. The algorithm is computationally efficient, with cost linear in the dimension and quadratic in the data size. The whole empirical Receiver Operating Characteristic (ROC) curve can be derived from the estimated score function at almost no additional cost. The algorithm does not require choosing complicated tuning parameters or function approximation classes, and it can adapt to local structure, such as local changes in dimensionality, by incorporating manifold learning techniques. We demonstrate the algorithm on both artificial and real data sets in high-dimensional feature spaces.
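
The scoring and thresholding idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the exact statistic, so everything specific here is an assumption: the statistic is taken to be the Euclidean distance to the K-th nearest nominal neighbor, the score is the fraction of nominal points whose own statistic is at least as large as the test point's, and the names `kth_nn_distance`, `lpe_scores`, `detect_outliers`, `empirical_roc`, and the default `k=5` are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def kth_nn_distance(points, reference, k, exclude_self=False):
    """Distance from each row of `points` to its k-th nearest row of `reference`.

    Assumed statistic (not given in the abstract): Euclidean K-NN distance.
    """
    # All pairwise distances: linear in dimension, quadratic in sample size.
    diffs = points[:, None, :] - reference[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    if exclude_self:
        # `points` and `reference` are the same set: ignore self-matches.
        np.fill_diagonal(dists, np.inf)
    return np.sort(dists, axis=1)[:, k - 1]


def lpe_scores(nominal, test, k=5):
    """Empirical p-value-like score in [0, 1] for each test point."""
    g_nominal = kth_nn_distance(nominal, nominal, k, exclude_self=True)
    g_test = kth_nn_distance(test, nominal, k)
    # Fraction of nominal points that look at least as isolated as the test point.
    return (g_nominal[None, :] >= g_test[:, None]).mean(axis=1)


def detect_outliers(nominal, test, alpha=0.05, k=5):
    """Flag a test point when its score falls below the false alarm level alpha."""
    return lpe_scores(nominal, test, k) < alpha


def empirical_roc(scores_nominal, scores_outlier, alphas):
    """Sweep alpha over precomputed scores; no distances are recomputed."""
    false_alarm = np.array([(scores_nominal < a).mean() for a in alphas])
    detection = np.array([(scores_outlier < a).mean() for a in alphas])
    return false_alarm, detection


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    nominal = rng.normal(size=(200, 5))          # synthetic nominal data
    test = np.vstack([np.zeros(5), 10.0 * np.ones(5)])
    print(lpe_scores(nominal, test))
    print(detect_outliers(nominal, test, alpha=0.05))
```

Because the scores are computed once, sweeping α in `empirical_roc` only compares stored numbers against a threshold, which is one way to read the abstract's remark that the empirical ROC curve comes at almost no additional cost.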