Improving log anomaly detection via spatial pooling: Combining SPClassifier with ensemble method

Hironori Uchida, Keitaro Tominaga, Hideki Itai, Yujie Li, Yoshihisa Nakatoh
Cognitive Robotics, vol. 4, pp. 217-227, 2024. DOI: 10.1016/j.cogr.2024.10.001
https://www.sciencedirect.com/science/article/pii/S2667241324000132

Abstract

In the fast-changing field of software development, new bugs emerge daily and take significant time to analyze. Research is therefore under way on automating bug resolution, using techniques such as deep-learning-based anomaly detection applied to text logs. This study focuses on anomaly detection using text logs and aims to address current challenges. Specifically, we aim to improve the accuracy of SPClassifier, a robust and lightweight AI model capable of handling dynamic log datasets through ad-hoc learning. We employ three ensemble learning methods to enhance the accuracy of SPClassifier. The method that achieved the greatest improvement was Improved Bagging, which combines the non-overlapping sampling of Pasting with the overlapping sampling of Bagging, resulting in a maximum F1-score improvement of 155%. Additionally, on certain datasets, the F1-score surpassed that of well-known DNN methods by 130%. Furthermore, the proposed method showed lower variance than DNN methods, an advantage particularly in environments where datasets fluctuate frequently, such as software development. These results highlight the clear superiority of the proposed method, which is lightweight in terms of computational resources and supports ad-hoc learning.
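
The abstract contrasts two sampling schemes without giving details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the usual definitions: Pasting draws each training subset without replacement, while Bagging draws with replacement. The function `combined_sample`, its parameters, and the way it mixes the two schemes are hypothetical; the abstract does not say how Improved Bagging combines them. Integer indices stand in for log records, and `majority_vote` is a generic aggregation step, not something the abstract specifies.

```python
import random
from collections import Counter


def pasting_sample(pool, size, rng):
    # Pasting: draw without replacement, so no record repeats within the subset.
    return rng.sample(pool, size)


def bagging_sample(pool, size, rng):
    # Bagging: draw with replacement, so records may repeat within the subset.
    return rng.choices(pool, k=size)


def combined_sample(pool, unique_size, resampled_size, rng):
    # Hypothetical mix (an assumption, not the paper's definition):
    # a duplicate-free part followed by a bootstrap part.
    unique_part = pasting_sample(pool, unique_size, rng)
    resampled_part = bagging_sample(pool, resampled_size, rng)
    return unique_part + resampled_part


def majority_vote(labels):
    # Aggregate base-learner predictions by taking the most frequent label.
    return Counter(labels).most_common(1)[0][0]


rng = random.Random(0)
pool = list(range(20))  # stand-in indices for log records
# One training subset per base learner in the ensemble.
subsets = [combined_sample(pool, 6, 4, rng) for _ in range(3)]
```

Each entry of `subsets` would be used to train one base classifier, and the ensemble's output for a log line would be `majority_vote` over the base classifiers' labels.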