Shrijita Bhattacharya, Francois Kamper, J. Beirlant
Scandinavian Journal of Statistics, volume 50, pages 1466-1502. Journal article, published 2023-05-17. DOI: 10.1111/sjos.12665 (https://doi.org/10.1111/sjos.12665)
Outlier detection based on extreme value theory and applications
Whether an extreme observation is an outlier depends strongly on the tail behavior of the underlying distribution. We develop an automatic, data‐driven method rooted in the mathematical theory of extremes to identify observations that deviate from the intermediate and central characteristics. The proposed algorithm extends to all max‐domains of attraction a method previously proposed in the literature for the specific case of heavy‐tailed Pareto‐type distributions. We propose some applications, such as a tail‐adjusted boxplot, which yields a more accurate representation of possible outliers, and the identification of outliers in a multivariate context through an analysis of associated random variables such as local outlier factors. Several examples and simulation results illustrate the finite‐sample behavior of the algorithm and its applications.
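The dependence on tail behavior can be illustrated with a small sketch. It is not the authors' algorithm: it uses the classical Hill estimator and a Weissman-type quantile extrapolation, which apply only to the Pareto-type case that the abstract says the paper generalizes. The function names, the choice `k=100`, and the exceedance level `p` are assumptions made for illustration.

```python
import math
import random


def hill_estimator(sample, k):
    """Hill estimate of the extreme value index from the k largest values."""
    xs = sorted(sample, reverse=True)
    if not 0 < k < len(xs) or xs[k] <= 0:
        raise ValueError("need 0 < k < n and a positive threshold")
    threshold = xs[k]  # the (k+1)-th largest observation
    return sum(math.log(x / threshold) for x in xs[:k]) / k


def weissman_quantile(sample, k, p):
    """Weissman-type estimate of the level exceeded with probability p."""
    xs = sorted(sample, reverse=True)
    gamma = hill_estimator(sample, k)
    return xs[k] * (k / (len(xs) * p)) ** gamma


def boxplot_upper_fence(sample):
    """Classical upper fence Q3 + 1.5 * IQR with simple empirical quartiles."""
    xs = sorted(sample)
    n = len(xs)
    q1, q3 = xs[n // 4], xs[(3 * n) // 4]
    return q3 + 1.5 * (q3 - q1)


# Hypothetical demo: a pure Pareto sample that contains no outliers at all.
rng = random.Random(0)
n = 1000
sample = [rng.paretovariate(2.0) for _ in range(n)]

fence = boxplot_upper_fence(sample)
# Assumed level: a value expected to be exceeded once in 10 samples of size n.
tail_cut = weissman_quantile(sample, k=100, p=1.0 / (10 * n))

print("flagged by boxplot fence:", sum(x > fence for x in sample))
print("flagged by tail-based cut:", sum(x > tail_cut for x in sample))
```

For a Pareto distribution with shape 2, the population upper fence is about 3.27 and is exceeded with probability of roughly 9%, so the classical boxplot flags many ordinary tail observations. A threshold extrapolated from the estimated tail flags far fewer.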
About the journal:
The Scandinavian Journal of Statistics is internationally recognised as one of the leading statistical journals in the world. It was founded in 1974 by four Scandinavian statistical societies. Today more than eighty per cent of the manuscripts are submitted from outside Scandinavia.
It is an international journal devoted to reporting significant and innovative original contributions to statistical methodology, both theory and applications.
The journal specializes in statistical modelling, with particular appreciation of the underlying substantive research problems.
The emergence of specialized methods for analysing longitudinal and spatial data is just one example of an area of important methodological development in which the Scandinavian Journal of Statistics has a particular niche.