Log anomaly detection in AIOps: A real-world implementation using Large Language Models

Impact factor (IF): 3.6

Systems and Soft Computing. Pub Date: 2026-06-01. Epub Date: 2026-03-05. DOI: 10.1016/j.sasc.2026.200475

Miguel De la Cruz Cabello, Tiago Prince Sales, Marcos R. Machado

Volume 8, Article 200475. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S2772941926000384

Citations: 0

Abstract

This study investigates the application of Large Language Models (LLMs) to log anomaly detection within the emerging field of AIOps, where large-scale operational logs are increasingly used to support reliability engineering and automated incident response. Deploying LLM-based anomaly detection in military environments, however, raises practical constraints, including strict data confidentiality, limited data sharing, and frequent shifts in operational conditions and log formats. To address these challenges, we design and implement a self-supervised anomaly detection framework based on LogBERT, trained only on normal Linux syslog sequences, and deploy it locally to avoid external dependencies. We explore critical parameters for detecting log anomalies, including the minimum number of tokens per log sequence, sliding window intervals, and mask ratios. In controlled experiments, a 15-second sliding window with a 10-second overlap provided the best trade-off between detection effectiveness and inference latency, supporting real-time monitoring requirements. Quantitative evaluation shows that shorter sliding windows, despite capturing less context, yielded slightly higher detection performance on anomalous logs. The model achieved high accuracy in distinguishing normal from abnormal log sequences, with a sequence classified as anomalous if more than 10% of its masked tokens were incorrectly predicted. A qualitative assessment with domain experts further validated the operational usefulness of the approach, indicating reduced manual monitoring effort and suitability for integration into AIOps pipelines under confidentiality constraints.
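The windowing and decision rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `(timestamp, log_key)` event format, and the use of a single top-1 prediction per masked token are assumptions made for illustration only. The 15-second window, 10-second overlap, and 10% threshold are the values reported in the abstract. The masked-token predictions would come from a trained LogBERT-style model, which is not shown here.

```python
from typing import List, Sequence, Tuple

WINDOW_SECONDS = 15.0   # window length reported in the abstract
OVERLAP_SECONDS = 10.0  # overlap reported in the abstract
MISS_THRESHOLD = 0.10   # anomalous if more than 10% of masked tokens are missed


def sliding_windows(
    events: Sequence[Tuple[float, str]],
    window: float = WINDOW_SECONDS,
    overlap: float = OVERLAP_SECONDS,
) -> List[List[str]]:
    """Group (timestamp, log_key) events into overlapping time windows.

    Assumes events are sorted by timestamp in seconds (hypothetical format).
    Consecutive windows start (window - overlap) seconds apart, so a 15 s
    window with a 10 s overlap advances in 5 s steps.
    """
    if not events:
        return []
    stride = window - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than window")
    start = events[0][0]
    last = events[-1][0]
    windows: List[List[str]] = []
    while start <= last:
        keys = [key for ts, key in events if start <= ts < start + window]
        if keys:  # skip windows that contain no log lines
            windows.append(keys)
        start += stride
    return windows


def is_anomalous(
    masked_truth: Sequence[str],
    predictions: Sequence[str],
    threshold: float = MISS_THRESHOLD,
) -> bool:
    """Flag a sequence when the share of wrongly predicted masked tokens
    is strictly greater than the threshold.

    Assumption: one top-1 prediction per masked position; the real model
    may instead check a set of top-ranked candidates.
    """
    if len(masked_truth) != len(predictions):
        raise ValueError("truth and predictions must have the same length")
    if not masked_truth:
        return False
    misses = sum(1 for t, p in zip(masked_truth, predictions) if t != p)
    return misses / len(masked_truth) > threshold
```

Under these assumptions, each window returned by `sliding_windows` would be tokenized, randomly masked at the chosen mask ratio, and scored by the model before `is_anomalous` is applied. Note that a shorter stride yields more windows per minute, which is one reason window size interacts with inference latency.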


Source journal

Systems and Soft Computing

CiteScore

2.20

Self-citation rate

0.00%

Number of publications