An Approach for Ranking Feature-based Clustering Methods and its Application in Multi-System Infrastructure Monitoring

Andreas Schörgenhumer, T. Natschläger, P. Grünbacher, Mario Kahlhofer, Peter Chalupar, H. Mössenböck
Published in: 2021 47th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 2021-09-01
DOI: 10.1109/SEAA53835.2021.00031 (https://doi.org/10.1109/SEAA53835.2021.00031)

Abstract

Companies need to collect and analyze time series data to continuously monitor the behavior of software systems during operation. Such data can in turn be used for performance monitoring, anomaly detection, or identifying problems after system crashes. However, gaining insights into common data patterns in time series is challenging, particularly when the data concern different properties and come from multiple systems. Clustering approaches have hardly been studied in the context of monitoring data, despite their possible benefits. In this paper, we present a feature-based approach to identify clusters in unlabeled infrastructure monitoring data collected from multiple independent software systems. We introduce time series properties, which are grouped into feature sets, and combine them with various unsupervised machine learning models to find the methods best suited for our clustering goal. We thoroughly evaluate our approach using two large-scale, industrial monitoring datasets. Finally, we apply one of the top-ranked methods to thousands of time series from hundreds of software systems, thereby showing the usefulness of our approach.
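The general idea of combining feature sets with clustering models and ranking the resulting methods can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the four per-series properties, the three feature sets in `FEATURE_SETS`, the use of a simple k-means with different `k` as a stand-in for "various unsupervised machine learning models", the silhouette score as the ranking criterion, and the synthetic data.

```python
import math
import random
from itertools import product
from statistics import mean, pstdev


def extract_features(series):
    """Hypothetical per-series properties (not the paper's feature list)."""
    n = len(series)
    mu = mean(series)
    # Lag-1 autocorrelation; defined as 0.0 for a constant series.
    denom = sum((x - mu) ** 2 for x in series)
    num = sum((series[i] - mu) * (series[i + 1] - mu) for i in range(n - 1))
    acf1 = num / denom if denom else 0.0
    # Least-squares slope of the values against the time index.
    t_mu = (n - 1) / 2
    slope = (sum((i - t_mu) * (x - mu) for i, x in enumerate(series))
             / sum((i - t_mu) ** 2 for i in range(n)))
    return {"mean": mu, "std": pstdev(series), "acf1": acf1, "slope": slope}


# Assumed grouping of properties into feature sets.
FEATURE_SETS = {
    "level": ["mean", "std"],
    "shape": ["acf1", "slope"],
    "all": ["mean", "std", "acf1", "slope"],
}


def standardize(rows):
    """Z-score each column; constant columns are left unscaled."""
    stats = [(mean(c), pstdev(c) or 1.0) for c in zip(*rows)]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, seed=0, iters=50):
    """Plain Lloyd's algorithm with seeded random initial centers."""
    centers = random.Random(seed).sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [mean(col) for col in zip(*members)]
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; -1.0 if fewer than two clusters exist."""
    if len(set(labels)) < 2:
        return -1.0
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:  # singleton cluster
            scores.append(0.0)
            continue
        a = mean(own)
        b = min(mean(dist(p, q) for q, l in zip(points, labels) if l == other)
                for other in set(labels) if other != labels[i])
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)


def rank_methods(series_list, ks=(2, 3)):
    """Score every (feature set, k) combination and sort best-first."""
    feats = [extract_features(s) for s in series_list]
    results = []
    for (name, keys), k in product(FEATURE_SETS.items(), ks):
        points = standardize([[f[key] for key in keys] for f in feats])
        results.append((silhouette(points, kmeans(points, k)), name, k))
    return sorted(results, reverse=True)


# Synthetic stand-in data: ten flat noisy series and ten trending series.
rng = random.Random(1)
flat = [[rng.gauss(10, 1) for _ in range(60)] for _ in range(10)]
trend = [[0.5 * t + rng.gauss(0, 1) for t in range(60)] for _ in range(10)]
ranking = rank_methods(flat + trend)
for score, name, k in ranking:
    print(f"{name:>5} k={k} silhouette={score:.3f}")
```

An internal index such as the silhouette score is only one possible ranking criterion for unlabeled data; the sketch uses it because it needs no ground-truth labels, not because the abstract names it.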