Utilizing Persistence for Post Facto Suppression of Invalid Anomalies Using System Logs

Dipanwita Guhathakurta, Pooja Aggarwal, Seema Nagar, Rohan Arora, Bing Zhou
Citations: 2

Abstract

The robustness and availability of cloud services are becoming increasingly important as more applications migrate to the cloud. The operations landscape today is more complex than ever. Site reliability engineers (SREs) are expected to handle more incidents than ever before with shorter service-level agreements (SLAs). By exploiting log, tracing, metric, and network data, Artificial Intelligence for IT Operations (AIOps) enables detection of faults and anomalous issues of services. A wide variety of anomaly detection techniques (e.g., PCA and autoencoders) have been incorporated in various AIOps platforms, but they all suffer from false positives. In this paper, we propose an unsupervised approach for persistent anomaly detection on top of the traditional anomaly detection approaches, with the goal of reducing false positives and providing more trustworthy alerting signals. We test our method on both simulated and real-world datasets. Our technique reduces false positive anomalies by at least 28%, resulting in more reliable and trustworthy notifications.

CCS Concepts: Computing methodologies → Anomaly detection; Software and its engineering → Maintaining software.

ACM Reference Format: Dipanwita Guhathakurta, Pooja Aggarwal, Seema Nagar, Rohan Arora, and Bing Zhou. 2022. Utilizing Persistence for Post Facto Suppression of Invalid Anomalies Using System Logs. In New Ideas and Emerging Results (ICSE-NIER'22), May 21-29, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3510455.3512774
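The abstract does not spell out how persistence is measured, so the following is only a minimal illustrative sketch of the general idea of post facto suppression, not the authors' method. It assumes a base detector (for example PCA or an autoencoder) has already produced one boolean anomaly flag per time window, and it keeps a flag only when it belongs to a run of consecutive flagged windows. The function name `suppress_transient` and the `min_persistence` threshold are hypothetical and chosen here for illustration.

```python
from typing import List


def suppress_transient(flags: List[bool], min_persistence: int = 3) -> List[bool]:
    """Keep only flags that sit in a run of at least `min_persistence`
    consecutive flagged windows; shorter runs are treated as transient
    and suppressed. (Hypothetical rule, not taken from the paper.)"""
    kept = [False] * len(flags)
    run_start = None  # index where the current run of flagged windows began

    # A trailing False acts as a sentinel so the final run is also closed.
    for i, flagged in enumerate(list(flags) + [False]):
        if flagged and run_start is None:
            run_start = i
        elif not flagged and run_start is not None:
            if i - run_start >= min_persistence:
                for j in range(run_start, i):
                    kept[j] = True
            run_start = None
    return kept


if __name__ == "__main__":
    # Per-window output of some base detector (made-up values).
    raw = [True, False, True, True, True, False, False, True, True]
    print(suppress_transient(raw, min_persistence=3))
```

In this sketch the isolated flag at the start and the two-window run at the end are suppressed, while the three-window run in the middle is kept. A fixed run-length threshold is the simplest possible stand-in; an unsupervised approach such as the one the abstract describes would presumably derive its notion of persistence from the data rather than from a hand-set constant.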