数据中心网络中用于交换机故障诊断和预测的Syslog处理

Shenglin Zhang, Weibin Meng, Jiahao Bu, Sen Yang, Y. Liu, Dan Pei, Jun Xu, Yu Chen, Hui Dong, Xianping Qu, Lei Song
{"title":"数据中心网络中用于交换机故障诊断和预测的Syslog处理","authors":"Shenglin Zhang, Weibin Meng, Jiahao Bu, Sen Yang, Y. Liu, Dan Pei, Jun Xu, Yu Chen, Hui Dong, Xianping Qu, Lei Song","doi":"10.1109/IWQoS.2017.7969130","DOIUrl":null,"url":null,"abstract":"Syslogs on switches are a rich source of information for both post-mortem diagnosis and proactive prediction of switch failures in a datacenter network. However, such information can be effectively extracted only through proper processing of syslogs, e.g., using suitable machine learning techniques. A common approach to syslog processing is to extract (i.e., build) templates from historical syslog messages and then match syslog messages to these templates. However, existing template extraction techniques either have low accuracies in learning the “correct” set of templates, or does not support incremental learning in the sense the entire set of templates has to be rebuilt (from processing all historical syslog messages again) when a new template is to be added, which is prohibitively expensive computationally if used for a large datacenter network. To address these two problems, we propose a frequent template tree (FT-tree) model in which frequent combinations of (syslog) words are identified and then used as message templates. FT-tree empirically extracts message templates more accurately than existing approaches, and naturally supports incremental learning. To compare the performance of FT-tree and three other template learning techniques, we experimented them on two-years' worth of failure tickets and syslogs collected from switches deployed across 10+ datacenters of a tier-1 cloud service provider. The experiments demonstrated that FT-tree improved the estimation/prediction accuracy (as measured by F1) by 155% to 188%, and the computational efficiency by 117 to 730 times.","PeriodicalId":422861,"journal":{"name":"2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"69","resultStr":"{\"title\":\"Syslog processing for switch failure diagnosis and prediction in datacenter networks\",\"authors\":\"Shenglin Zhang, Weibin Meng, Jiahao Bu, Sen Yang, Y. Liu, Dan Pei, Jun Xu, Yu Chen, Hui Dong, Xianping Qu, Lei Song\",\"doi\":\"10.1109/IWQoS.2017.7969130\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Syslogs on switches are a rich source of information for both post-mortem diagnosis and proactive prediction of switch failures in a datacenter network. However, such information can be effectively extracted only through proper processing of syslogs, e.g., using suitable machine learning techniques. A common approach to syslog processing is to extract (i.e., build) templates from historical syslog messages and then match syslog messages to these templates. However, existing template extraction techniques either have low accuracies in learning the “correct” set of templates, or does not support incremental learning in the sense the entire set of templates has to be rebuilt (from processing all historical syslog messages again) when a new template is to be added, which is prohibitively expensive computationally if used for a large datacenter network. To address these two problems, we propose a frequent template tree (FT-tree) model in which frequent combinations of (syslog) words are identified and then used as message templates. FT-tree empirically extracts message templates more accurately than existing approaches, and naturally supports incremental learning. To compare the performance of FT-tree and three other template learning techniques, we experimented them on two-years' worth of failure tickets and syslogs collected from switches deployed across 10+ datacenters of a tier-1 cloud service provider. The experiments demonstrated that FT-tree improved the estimation/prediction accuracy (as measured by F1) by 155% to 188%, and the computational efficiency by 117 to 730 times.\",\"PeriodicalId\":422861,\"journal\":{\"name\":\"2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS)\",\"volume\":\"19 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-06-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"69\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IWQoS.2017.7969130\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IWQoS.2017.7969130","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 69

摘要

交换机上的syslog日志是数据中心网络中交换机故障事后诊断和主动预测的丰富信息源。然而,这些信息只能通过适当的syslog处理来有效地提取,例如,使用合适的机器学习技术。syslog处理的一种常用方法是从历史syslog消息中提取(即构建)模板,然后将syslog消息与这些模板进行匹配。然而,现有的模板提取技术要么在学习“正确的”模板集方面精度较低,要么不支持增量学习,因为当要添加新模板时,必须重新构建整个模板集(再次处理所有历史syslog消息),如果用于大型数据中心网络,这在计算上是非常昂贵的。为了解决这两个问题,我们提出了一个频繁模板树(FT-tree)模型,在该模型中,(syslog)单词的频繁组合被识别出来,然后用作消息模板。FT-tree在经验上比现有方法更准确地提取消息模板,并且自然支持增量学习。为了比较FT-tree和其他三种模板学习技术的性能,我们对从部署在一级云服务提供商的10多个数据中心的交换机上收集的两年的故障票据和syslog日志进行了实验。实验表明,FT-tree将估计/预测精度(以F1衡量)提高了155% ~ 188%,计算效率提高了117 ~ 730倍。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Syslog processing for switch failure diagnosis and prediction in datacenter networks
Syslogs on switches are a rich source of information for both post-mortem diagnosis and proactive prediction of switch failures in a datacenter network. However, such information can be effectively extracted only through proper processing of syslogs, e.g., using suitable machine learning techniques. A common approach to syslog processing is to extract (i.e., build) templates from historical syslog messages and then match syslog messages to these templates. However, existing template extraction techniques either have low accuracies in learning the “correct” set of templates, or does not support incremental learning in the sense the entire set of templates has to be rebuilt (from processing all historical syslog messages again) when a new template is to be added, which is prohibitively expensive computationally if used for a large datacenter network. To address these two problems, we propose a frequent template tree (FT-tree) model in which frequent combinations of (syslog) words are identified and then used as message templates. FT-tree empirically extracts message templates more accurately than existing approaches, and naturally supports incremental learning. To compare the performance of FT-tree and three other template learning techniques, we experimented them on two-years' worth of failure tickets and syslogs collected from switches deployed across 10+ datacenters of a tier-1 cloud service provider. The experiments demonstrated that FT-tree improved the estimation/prediction accuracy (as measured by F1) by 155% to 188%, and the computational efficiency by 117 to 730 times.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信