Mining variable-length anomalies in road network time series: A two-stage optimization framework
Authors: Hendri Sutrisno, Frederick Kin Hing Phoa
Journal: Applied Soft Computing, volume 181, Article 113516 (published 2025-07-01)
Impact factor: 6.6; JCR: Q1 (Computer Science, Artificial Intelligence)
DOI: 10.1016/j.asoc.2025.113516
URL: https://www.sciencedirect.com/science/article/pii/S1568494625008270
Detecting variable-length anomalous subsequences in network traffic is challenging because anomalies follow no fixed temporal pattern. They may begin at any point, last for unpredictable durations, and behave differently depending on the context. Without prior knowledge of where an anomaly occurs or how long it lasts, any subsequence of the time series could be considered anomalous. This uncertainty increases the search complexity, as the method must explore many candidate subsequences with different lengths and starting times. Since labeled anomalies are often unavailable, the problem is framed as an unsupervised discovery task, so the method must search for and validate anomalies without prior training. This makes the problem not only computationally challenging but also conceptually difficult. Existing methods often struggle because they rely on exhaustive searches that require heavy computation. Moreover, when spatial-temporal dynamics are considered, such as in road network traffic where anomalies can propagate across locations with variable delays, the problem becomes even more complex, as the detection method must account for both when and where anomalies occur. To address these challenges, we propose a two-stage optimization framework called MP_OPT. In the first stage, the matrix profile is applied to signal potential anomaly locations. In the second stage, a metaheuristic optimizer refines the starting point and length of each detected signal. During refinement, Latin hypercube sampling is used to reduce the number of comparisons between candidate signals and neighboring patterns without sacrificing generalization. We validate the proposed framework using network traffic flow data from Taiwan's freeway system. Experimental results show that MP_OPT is at least 26 times faster than benchmarking methods while achieving up to 28.5% higher search accuracy, measured by relative anomaly scores. These results demonstrate the practical applicability and efficiency of our work for detecting complex anomalies in network time series.
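To make the two ideas in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a brute-force matrix profile (z-normalized Euclidean distance to the nearest non-overlapping neighbor) for the first stage, and it assumes that Latin hypercube sampling is used to pick which neighbor positions a candidate is compared against. The metaheuristic optimizer is omitted. All function names, the exclusion-zone size, and the toy data are hypothetical.

```python
import math
import random


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0.0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def naive_matrix_profile(series, m):
    """Brute-force matrix profile, O(n^2 * m): for every length-m window,
    the distance to its nearest neighbor outside an exclusion zone."""
    n_windows = len(series) - m + 1
    windows = [znorm(series[i:i + m]) for i in range(n_windows)]
    exclusion = m // 2  # assumed zone size, used to skip trivial matches
    profile = []
    for i in range(n_windows):
        best = math.inf
        for j in range(n_windows):
            if abs(i - j) <= exclusion:
                continue
            best = min(best, math.dist(windows[i], windows[j]))
        profile.append(best)
    return profile


def lhs_starts(n_samples, n_positions, rng):
    """1-D Latin hypercube sample of start indices: one draw per stratum."""
    width = n_positions / n_samples
    return [min(n_positions - 1, int((k + rng.random()) * width))
            for k in range(n_samples)]


def sampled_score(series, start, length, n_samples, rng):
    """Anomaly score of one candidate (start, length): nearest-neighbor
    distance computed only against LHS-sampled neighbor windows."""
    query = znorm(series[start:start + length])
    n_positions = len(series) - length + 1
    best = math.inf
    for j in lhs_starts(n_samples, n_positions, rng):
        if abs(j - start) < length:  # skip neighbors overlapping the candidate
            continue
        best = min(best, math.dist(query, znorm(series[j:j + length])))
    return best


# Toy data (hypothetical): a period-20 sine with a period-5 burst at 120..139.
series = [math.sin(2 * math.pi * t / 20) for t in range(240)]
for t in range(120, 140):
    series[t] = math.sin(2 * math.pi * t / 5)

m = 20
profile = naive_matrix_profile(series, m)
# Stage 1: the window with the largest profile value is the candidate signal.
candidate_start = max(range(len(profile)), key=profile.__getitem__)

# Stage 2 ingredient: score candidates with only 30 sampled comparisons each.
burst_score = sampled_score(series, 120, m, 30, random.Random(0))
normal_score = sampled_score(series, 40, m, 30, random.Random(0))
```

In this sketch, an optimizer would search over `(start, length)` pairs near `candidate_start` and use `sampled_score` as its objective, so each evaluation costs a fixed number of comparisons rather than a scan of the whole series.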
About the journal:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and other similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.