贝叶斯异常检测与分类

E. Roberts, B. Bassett, M. Lochner
{"title":"贝叶斯异常检测与分类","authors":"E. Roberts, B. Bassett, M. Lochner","doi":"10.3233/his-200282","DOIUrl":null,"url":null,"abstract":"Statistical uncertainties are rarely incorporated into machine learning algorithms, especially for anomaly detection. Here we present the Bayesian Anomaly Detection And Classification (BADAC) formalism, which provides a unified statistical approach to classification and anomaly detection within a hierarchical Bayesian framework. BADAC deals with uncertainties by marginalising over the unknown, true, value of the data. Using simulated data with Gaussian noise as an example, BADAC is shown to be superior to standard algorithms in both classification and anomaly detection performance in the presence of uncertainties. Additionally, BADAC provides well-calibrated classification probabilities, valuable for use in scientific pipelines. We show that BADAC can work in online mode and is fairly robust to model errors, which can be diagnosed through model-selection methods. In addition it can perform unsupervised new class detection and can naturally be extended to search for anomalous subsets of data. BADAC is therefore ideal where computational cost is not a limiting factor and statistical rigour is important. We discuss approximations to speed up BADAC, such as the use of Gaussian processes, and finally introduce a new metric, the Rank-Weighted Score (RWS), that is particularly suited to evaluating an algorithm’s ability to detect anomalies.","PeriodicalId":88526,"journal":{"name":"International journal of hybrid intelligent systems","volume":"1 1","pages":"426-435"},"PeriodicalIF":0.0000,"publicationDate":"2019-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.3233/his-200282","citationCount":"10","resultStr":"{\"title\":\"Bayesian Anomaly Detection and Classification\",\"authors\":\"E. Roberts, B. Bassett, M. Lochner\",\"doi\":\"10.3233/his-200282\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Statistical uncertainties are rarely incorporated into machine learning algorithms, especially for anomaly detection. Here we present the Bayesian Anomaly Detection And Classification (BADAC) formalism, which provides a unified statistical approach to classification and anomaly detection within a hierarchical Bayesian framework. BADAC deals with uncertainties by marginalising over the unknown, true, value of the data. Using simulated data with Gaussian noise as an example, BADAC is shown to be superior to standard algorithms in both classification and anomaly detection performance in the presence of uncertainties. Additionally, BADAC provides well-calibrated classification probabilities, valuable for use in scientific pipelines. We show that BADAC can work in online mode and is fairly robust to model errors, which can be diagnosed through model-selection methods. In addition it can perform unsupervised new class detection and can naturally be extended to search for anomalous subsets of data. BADAC is therefore ideal where computational cost is not a limiting factor and statistical rigour is important. We discuss approximations to speed up BADAC, such as the use of Gaussian processes, and finally introduce a new metric, the Rank-Weighted Score (RWS), that is particularly suited to evaluating an algorithm’s ability to detect anomalies.\",\"PeriodicalId\":88526,\"journal\":{\"name\":\"International journal of hybrid intelligent systems\",\"volume\":\"1 1\",\"pages\":\"426-435\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-02-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.3233/his-200282\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International journal of hybrid intelligent systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3233/his-200282\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International journal of hybrid intelligent systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3233/his-200282","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

摘要

统计不确定性很少被纳入机器学习算法中,尤其是在异常检测中。在这里,我们提出了贝叶斯异常检测和分类(BADAC)形式,它在分层贝叶斯框架内为分类和异常检测提供了统一的统计方法。BADAC通过边缘化未知、真实的数据价值来处理不确定性。以高斯噪声模拟数据为例,在存在不确定性的情况下,BADAC在分类和异常检测性能方面都优于标准算法。此外,BADAC提供了经过良好校准的分类概率,对科学管道的使用很有价值。我们证明了BADAC可以在在线模式下工作,并且对模型误差具有相当的鲁棒性,可以通过模型选择方法进行诊断。此外,它可以执行无监督的新类别检测,并且可以自然地扩展到搜索数据的异常子集。因此,在计算成本不是限制因素并且统计严格性很重要的情况下,BADAC是理想的。我们讨论了加速BADAC的近似方法,例如使用高斯过程,最后引入了一种新的度量,即秩加权分数(RWS),它特别适合于评估算法检测异常的能力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Bayesian Anomaly Detection and Classification
Statistical uncertainties are rarely incorporated into machine learning algorithms, especially for anomaly detection. Here we present the Bayesian Anomaly Detection And Classification (BADAC) formalism, which provides a unified statistical approach to classification and anomaly detection within a hierarchical Bayesian framework. BADAC deals with uncertainties by marginalising over the unknown, true, value of the data. Using simulated data with Gaussian noise as an example, BADAC is shown to be superior to standard algorithms in both classification and anomaly detection performance in the presence of uncertainties. Additionally, BADAC provides well-calibrated classification probabilities, valuable for use in scientific pipelines. We show that BADAC can work in online mode and is fairly robust to model errors, which can be diagnosed through model-selection methods. In addition it can perform unsupervised new class detection and can naturally be extended to search for anomalous subsets of data. BADAC is therefore ideal where computational cost is not a limiting factor and statistical rigour is important. We discuss approximations to speed up BADAC, such as the use of Gaussian processes, and finally introduce a new metric, the Rank-Weighted Score (RWS), that is particularly suited to evaluating an algorithm’s ability to detect anomalies.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
3.30
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信