Outlier detection under interval and fuzzy uncertainty: algorithmic solvability and computational complexity

22nd International Conference of the North American Fuzzy Information Processing Society, NAFIPS 2003. Pub Date: 2003-07-24. DOI: 10.1109/NAFIPS.2003.1226818

V. Kreinovich, Praveen Patangay, L. Longpré, S. Starks, Cynthia Campos

Citations: 15

Abstract

In many application areas, it is important to detect outliers. The traditional engineering approach to outlier detection is to start with some "normal" values $x_1, \ldots, x_n$, compute the sample average $E$ and the sample standard deviation $\sigma$, and then mark a value $x$ as an outlier if $x$ is outside the $k_0$-sigma interval $[E - k_0 \cdot \sigma, E + k_0 \cdot \sigma]$ (for some pre-selected parameter $k_0$). In real life, we often have only interval ranges $[\underline{x}_i, \overline{x}_i]$ for the normal values $x_1, \ldots, x_n$. In this case, we only have intervals of possible values for the bounds $E - k_0 \cdot \sigma$ and $E + k_0 \cdot \sigma$. We can therefore identify outliers as values that are outside all $k_0$-sigma intervals. In this paper, we analyze the computational complexity of these outlier detection problems and provide efficient algorithms that solve some of these problems (under reasonable conditions). We also provide algorithms that estimate the degree of "outlier-ness" of a given value $x$, measured as the largest value $k_0$ for which $x$ is outside the corresponding $k_0$-sigma interval.
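
The definitions in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names are hypothetical, the standard deviation uses the divide-by-n form (an assumption, since the abstract does not say which form is meant), and the interval case is handled by brute force over all 2^n combinations of interval endpoints, which is only practical for small n. The brute force relies on the observation that, for k_0 >= 0, the lower bound E - k_0·σ is a concave function of the data and the upper bound E + k_0·σ is a convex one, so their extreme values over a box of intervals are reached at endpoint combinations.

```python
from itertools import product
from math import sqrt


def mean_and_sigma(values):
    """Return the average E and standard deviation sigma (divide-by-n form, an assumption)."""
    n = len(values)
    e = sum(values) / n
    sigma = sqrt(sum((v - e) ** 2 for v in values) / n)
    return e, sigma


def is_outlier(x, values, k0):
    """Classic test: x is an outlier if it lies outside [E - k0*sigma, E + k0*sigma]."""
    e, sigma = mean_and_sigma(values)
    return x < e - k0 * sigma or x > e + k0 * sigma


def outlierness(x, values):
    """Threshold |x - E| / sigma: x is outside the k0-sigma interval for every k0 below it."""
    e, sigma = mean_and_sigma(values)
    if sigma == 0:
        # Degenerate data: every k0-sigma interval collapses to the single point E.
        return float("inf") if x != e else 0.0
    return abs(x - e) / sigma


def is_guaranteed_outlier(x, intervals, k0):
    """Brute-force check (exponential in n) that x is outside ALL possible k0-sigma intervals.

    intervals is a list of (lower, upper) pairs, one per measured value; k0 must be >= 0.
    """
    lowest = float("inf")    # smallest possible value of E - k0*sigma
    highest = float("-inf")  # largest possible value of E + k0*sigma
    for corner in product(*intervals):
        e, sigma = mean_and_sigma(corner)
        lowest = min(lowest, e - k0 * sigma)
        highest = max(highest, e + k0 * sigma)
    return x < lowest or x > highest
```

For example, with exact values `[1, 2, 3, 4, 5]` and `k0 = 2`, the value 10 is flagged by `is_outlier` and the value 4 is not; replacing each value `v` with the interval `(v - 0.1, v + 0.1)` and calling `is_guaranteed_outlier` leaves 10 flagged, because it stays outside every k_0-sigma interval consistent with the data.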
