Authors: Junjun Li; Shi Ying; Tiangang Li; Xiangbo Tian
DOI: 10.1109/TNSM.2025.3583213
Journal: IEEE Transactions on Network and Service Management, vol. 22, no. 5, pp. 4884-4897
Publication date: 2025-06-26
Publication type: Journal Article
Impact factor: 5.4
JCR: Q1 (Computer Science, Information Systems)
Open access: no
URL: https://ieeexplore.ieee.org/document/11052877/
Citations: 0
TraceDAE: Trace-Based Anomaly Detection in Microservice Systems via Dual Autoencoder
Microservice systems have become a popular architecture for modern Web applications owing to their scalability, modularity, and maintainability. However, as these systems grow in size and complexity, anomaly detection becomes a critical task. In this paper, we introduce TraceDAE, a trace-based anomaly detection approach for microservice systems. The approach first constructs a Service Trace Graph (STG) that captures service invocation relationships and performance metrics, and then applies a dual autoencoder framework to it. In this framework, the structure autoencoder employs Graph Attention Networks (GAT) to analyze the graph structure, while the attribute autoencoder uses a Long Short-Term Memory (LSTM) network to process time series data. The approach can effectively identify two anomaly types, Service Response Abnormal and Service Invocation Abnormal. Experimental results on the evaluation datasets show that TraceDAE outperforms state-of-the-art (SOTA) trace-based anomaly detection methods, with F1-scores of 0.970 and 0.925, respectively.
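The abstract does not say how the Service Trace Graph is built, so the following is only a minimal sketch of the general idea, not the authors' construction. It assumes a hypothetical span schema (`span_id`, `parent_id`, `service`, `duration_ms`) and a hypothetical helper `build_service_trace_graph`; neither name comes from the paper. Invocation relationships become directed edges (the input a structure model would see), and per-service span durations become node attributes (the input an attribute model would see).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """One span of a trace. Hypothetical schema, not taken from the paper."""
    span_id: str
    parent_id: Optional[str]  # None for the root span
    service: str
    duration_ms: float


def build_service_trace_graph(spans):
    """Return (nodes, edges) for a single trace.

    nodes: service name -> list of span durations (performance attributes)
    edges: (caller, callee) -> number of invocations (graph structure)
    """
    by_id = {s.span_id: s for s in spans}
    nodes = defaultdict(list)
    edges = defaultdict(int)
    for s in spans:
        nodes[s.service].append(s.duration_ms)
        parent = by_id.get(s.parent_id) if s.parent_id is not None else None
        # Only cross-service calls become edges; spans whose parent lives in
        # the same service are treated as internal work (an assumption).
        if parent is not None and parent.service != s.service:
            edges[(parent.service, s.service)] += 1
    return dict(nodes), dict(edges)


# Illustrative trace with made-up service names and durations.
trace = [
    Span("a", None, "gateway", 120.0),
    Span("b", "a", "orders", 80.0),
    Span("c", "b", "inventory", 30.0),
    Span("d", "b", "payments", 40.0),
]
nodes, edges = build_service_trace_graph(trace)
```

In a setup like this, a missing or unexpected edge would point to an invocation anomaly, while an unusual duration on an otherwise normal graph would point to a response anomaly. How TraceDAE actually encodes and scores these is described in the paper itself, not here.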
About the journal:
IEEE Transactions on Network and Service Management will publish (online only) peer-reviewed, archival-quality papers that advance the state of the art and practical applications of network and service management. Theoretical research contributions (presenting new concepts and techniques) and applied contributions (reporting on experiences and experiments with actual systems) will be encouraged. These transactions will focus on the key technical issues related to: Management Models, Architectures and Frameworks; Service Provisioning, Reliability and Quality Assurance; Management Functions; Enabling Technologies; Information and Communication Models; Policies; Applications and Case Studies; Emerging Technologies and Standards.