Outlier detection under interval and fuzzy uncertainty: algorithmic solvability and computational complexity

V. Kreinovich, Praveen Patangay, L. Longpré, S. Starks, Cynthia Campos
22nd International Conference of the North American Fuzzy Information Processing Society, NAFIPS 2003. Published 2003-07-24. DOI: 10.1109/NAFIPS.2003.1226818 (https://doi.org/10.1109/NAFIPS.2003.1226818)
Citations: 15

Abstract

In many application areas, it is important to detect outliers. The traditional engineering approach to outlier detection is to start with some "normal" values $x_1, \ldots, x_n$, compute the sample average $E$ and the sample standard deviation $\sigma$, and then mark a value $x$ as an outlier if $x$ is outside the $k_0$-sigma interval $[E - k_0 \cdot \sigma, E + k_0 \cdot \sigma]$ (for some pre-selected parameter $k_0$). In real life, we often have only interval ranges $[\underline{x}_i, \overline{x}_i]$ for the normal values $x_1, \ldots, x_n$. In this case, we only have intervals of possible values for the bounds $E - k_0 \cdot \sigma$ and $E + k_0 \cdot \sigma$. We can therefore identify outliers as values that are outside all $k_0$-sigma intervals. In this paper, we analyze the computational complexity of these outlier detection problems and provide efficient algorithms that solve some of these problems (under reasonable conditions). We also provide algorithms that estimate the degree of "outlier-ness" of a given value $x$, measured as the largest value $k_0$ for which $x$ is outside the corresponding $k_0$-sigma interval.
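The rule described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the function names are hypothetical, the n-1 denominator for the standard deviation is an assumption (the abstract does not say which one is used), and the interval case is handled by exhaustive enumeration of the 2^n endpoint combinations, which is only practical for very small n. The enumeration relies on the fact that E - k0*sigma is a concave function of the values and E + k0*sigma is a convex one, so the smallest lower bound and the largest upper bound over the box of intervals are both attained at endpoint combinations.

```python
import itertools
import statistics


def k_sigma_interval(values, k0):
    """Return (E - k0*sigma, E + k0*sigma) for a sequence of numbers."""
    mean = statistics.fmean(values)
    # Assumption: n-1 denominator; the abstract does not specify it.
    sigma = statistics.stdev(values)
    return mean - k0 * sigma, mean + k0 * sigma


def is_outlier(x, values, k0):
    """Traditional rule: x is an outlier if it lies outside the k0-sigma interval."""
    lo, hi = k_sigma_interval(values, k0)
    return x < lo or x > hi


def outlierness(x, values):
    """Degree of outlier-ness |x - E| / sigma, the supremum of the k0 values
    for which x is outside the k0-sigma interval (undefined when sigma is 0)."""
    mean = statistics.fmean(values)
    sigma = statistics.stdev(values)
    return abs(x - mean) / sigma


def is_guaranteed_outlier(x, intervals, k0):
    """Brute-force check that x is outside every possible k0-sigma interval.

    intervals is a list of (lower, upper) pairs, one per normal value.
    Exponential in len(intervals); a sketch, not an efficient algorithm.
    """
    lowest = float("inf")
    highest = float("-inf")
    for vertex in itertools.product(*intervals):
        lo, hi = k_sigma_interval(vertex, k0)
        lowest = min(lowest, lo)
        highest = max(highest, hi)
    return x < lowest or x > highest


if __name__ == "__main__":
    values = [1.0, 2.0, 3.0, 4.0, 5.0]
    intervals = [(v - 0.1, v + 0.1) for v in values]
    print(is_outlier(6.3, values, 2))                # outlier for the exact values
    print(is_guaranteed_outlier(6.3, intervals, 2))  # but not once uncertainty is allowed
    print(outlierness(10.0, values))
```

In this example the value 6.3 falls outside the 2-sigma interval computed from the exact values, yet some combination of values within the intervals produces a 2-sigma interval that contains it, so it is not outside all such intervals.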