Practical and White-Box Anomaly Detection through Unsupervised and Active Learning

2020 29th International Conference on Computer Communications and Networks (ICCCN) Pub Date : 2020-08-01 DOI:10.1109/ICCCN49398.2020.9209704

Yao Wang, Zhaowei Wang, Zejun Xie, Nengwen Zhao, Junjie Chen, Wenchi Zhang, Kaixin Sui, Dan Pei


Citations: 15

Abstract

To ensure quality of service and user experience, large Internet companies monitor many Key Performance Indicators (KPIs) of their systems so that they can detect anomalies and identify failures in real time. However, because of the large number and variety of KPIs and the lack of high-quality labels, existing KPI anomaly detection approaches either perform well only on certain types of KPIs or consume excessive resources. To make generic and practical KPI anomaly detection possible in the real world, we propose a KPI anomaly detection framework named iRRCF-Active, which contains an unsupervised, white-box anomaly detector based on Robust Random Cut Forest (RRCF) and an active learning component. Specifically, we propose an improved RRCF (iRRCF) algorithm to overcome the drawbacks of applying the original RRCF to KPI anomaly detection. We also incorporate active learning so that the model benefits from high-quality labels given by experienced operators. We conduct extensive experiments on a large-scale public dataset and a private dataset collected from a large commercial bank. The experimental results demonstrate that iRRCF-Active performs better than existing traditional statistical methods, unsupervised learning methods, and supervised learning methods. Each component of iRRCF-Active is also shown to be effective and indispensable.
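To make the two ingredients named in the abstract more concrete, here is a minimal Python sketch of random-cut isolation on a KPI series plus a query-selection step for operator labeling. It is not the authors' iRRCF or their active learning procedure, neither of which the abstract specifies. The function names (`shingle`, `isolation_depth`, `mean_depths`, `pick_queries`), the use of mean isolation depth instead of RRCF's own scoring, the per-point random cuts instead of a shared forest, and the "closest to the threshold" query rule are all assumptions made for illustration.

```python
import random
from statistics import mean


def shingle(series, width):
    """Turn a 1-D KPI series into overlapping windows (points in `width` dimensions)."""
    return [tuple(series[i:i + width]) for i in range(len(series) - width + 1)]


def isolation_depth(points, target, rng, max_depth=64):
    """Count the random cuts needed to separate `target` from all other points."""
    current, depth = points, 0
    while len(current) > 1 and depth < max_depth:
        dims = range(len(target))
        lows = [min(p[d] for p in current) for d in dims]
        spans = [max(p[d] for p in current) - lows[d] for d in dims]
        if sum(spans) == 0:  # only duplicates of the target remain
            break
        # Random-cut rule: choose a dimension with probability proportional
        # to its range, then cut uniformly inside that range.
        d = rng.choices(list(dims), weights=spans)[0]
        cut = rng.uniform(lows[d], lows[d] + spans[d])
        # Keep only the points that fall on the target's side of the cut.
        current = [p for p in current if (p[d] <= cut) == (target[d] <= cut)]
        depth += 1
    return depth


def mean_depths(points, n_trees=50, seed=0):
    """Average isolation depth per point; a lower depth means more anomalous."""
    rng = random.Random(seed)
    return [mean(isolation_depth(points, p, rng) for _ in range(n_trees))
            for p in points]


def pick_queries(depths, threshold, k):
    """Hypothetical query rule: the k windows whose depth is closest to the threshold."""
    ranked = sorted(range(len(depths)), key=lambda i: abs(depths[i] - threshold))
    return ranked[:k]


if __name__ == "__main__":
    kpi = [10 + 0.5 * (i % 7) for i in range(60)]  # synthetic periodic KPI
    kpi[30] = 100.0                                # injected spike
    depths = mean_depths(shingle(kpi, 3))
    print("most anomalous window starts at index", depths.index(min(depths)))
    print("windows to show an operator:", pick_queries(depths, threshold=4.0, k=3))
```

The detector stays white-box in the sense that every score can be traced to a sequence of axis-parallel cuts. Windows that a spike makes easy to cut off end up with a small depth, and the borderline windows returned by `pick_queries` are the ones for which an operator's label would be most informative under this assumed rule.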



