Near Real-Time Service Monitoring Using High-Dimensional Time Series

Shwetabh Khanduja, Vinod Nair, S. Sundararajan, Ameya Raul, Ajesh Babu Shaj, S. Keerthi
Published in: 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 2015-11-14
DOI: 10.1109/ICDMW.2015.254 (https://doi.org/10.1109/ICDMW.2015.254)
Citations: 3

Abstract

We demonstrate a near real-time service monitoring system for detecting and diagnosing issues from high-dimensional time series data. For detection, we have implemented a learning algorithm that constructs a hierarchy of detectors from data. It is scalable, does not require labelled examples of issues for learning, runs in near real-time, and identifies a subset of counter time series as being relevant for a detected issue. For diagnosis, we provide efficient algorithms as post-detection diagnosis aids to find further relevant counter time series at issue times, a SQL-like query language for writing flexible queries that apply these algorithms on the time series data, and a graphical user interface for visualizing the detection and diagnosis results. Our solution has been deployed in production as an end-to-end system for monitoring Microsoft's internal distributed data storage and computing platform consisting of tens of thousands of machines and currently analyses about 12,000 counter time series.
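
The abstract does not describe the detection algorithm itself, so the following is only a minimal sketch of the general idea: unsupervised detection over many counters, arranged in a two-level hierarchy, that reports which counters look relevant to an issue. It is not the authors' method. The robust z-score (median and MAD), the function names `robust_z` and `detect`, the grouping scheme, and the thresholds are all assumptions made for illustration.

```python
from statistics import median


def robust_z(history, value):
    """Robust z-score of `value` against `history` (median/MAD based).

    Hypothetical leaf detector: needs no labelled issues, only past values.
    """
    med = median(history)
    mad = median(abs(x - med) for x in history)
    scale = 1.4826 * mad
    if scale == 0:
        scale = 1e-9  # guard against a constant history
    return abs(value - med) / scale


def detect(groups, history, latest, z_thresh=5.0, min_fraction=0.5):
    """Hypothetical two-level hierarchy of detectors.

    groups:  group name -> list of counter names (one leaf detector each)
    history: counter name -> list of past values
    latest:  counter name -> newest value
    Returns group name -> list of anomalous counters, only for groups in
    which at least `min_fraction` of the counters are anomalous.
    """
    issues = {}
    for group, counters in groups.items():
        flagged = [
            c for c in counters
            if robust_z(history[c], latest[c]) > z_thresh
        ]
        if len(flagged) / len(counters) >= min_fraction:
            issues[group] = flagged
    return issues


# Illustrative data only; counter and group names are made up.
history = {
    "cpu": [10, 11, 9, 10, 12, 10, 11],
    "disk": [5, 6, 4, 5, 7, 3, 5],
    "net": [100, 101, 99, 100, 102, 98, 100],
}
latest = {"cpu": 40, "disk": 30, "net": 100}
groups = {"storage": ["cpu", "disk"], "network": ["net"]}

issues = detect(groups, history, latest)
print(issues)
```

In this sketch a group-level detector fires only when enough of its leaf detectors agree, and the flagged leaves double as the subset of counters reported as relevant to the issue. A production system at the scale the abstract mentions (about 12,000 counters) would need a learned hierarchy and streaming updates, which this example does not attempt.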