Mining variable-length anomalies in road network time series: A two-stage optimization framework

IF 6.6 | Zone 1 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing Pub Date : 2025-07-01 DOI:10.1016/j.asoc.2025.113516

Hendri Sutrisno, Frederick Kin Hing Phoa

Applied Soft Computing, volume 181, Article 113516. Full text: https://www.sciencedirect.com/science/article/pii/S1568494625008270

Citations: 0

Abstract



Mining variable-length anomalies in road network time series: A two-stage optimization framework

Detecting variable-length anomalous subsequences in network traffic is challenging due to the absence of fixed temporal patterns. Anomalies may begin at any point, last for unpredictable durations, and exhibit diverse behaviors depending on the context. Without prior knowledge of where or how long an anomaly may occur, any motif in the time series could be considered anomalous. This uncertainty increases the search complexity, as the method must explore many possible subsequences with different lengths and timings. Since labeled anomalies are often unavailable, the problem is framed as an unsupervised discovery task. It also means the methods do the search and validate anomalies without prior training. This issue makes the problem not only computationally challenging but also conceptually difficult. Existing methods often struggle because they rely on exhaustive searches that require heavy computation. Moreover, when spatial–temporal dynamics are considered, such as in road network traffic where anomalies can propagate across different locations with variable delays, the problem becomes even more complex, as the detection method must account for both when and where anomalies occur. To address these challenges, we propose a two-stage optimization framework called $MP_{OPT}$. In the first stage, the matrix profile is applied to signal potential anomaly locations. In the second stage, a metaheuristic optimizer refines the starting point and length of each detected signal. During refinement, Latin hypercube sampling is used to reduce the number of comparisons between candidate signals and neighboring patterns without sacrificing generalization. We validate the proposed framework using network traffic flow data from Taiwan’s freeway system. Experimental results show that $MP_{OPT}$ is at least 26 times faster than benchmarking methods while achieving up to 28.5% higher search accuracy, measured based on relative anomaly scores. These results demonstrate the practical applicability and efficiency of our work for detecting complex anomalies in network time series.
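The two stages described in the abstract can be illustrated with a rough, single-series sketch. This is not the authors' implementation: the abstract does not name the metaheuristic or define the anomaly score, so the random-perturbation hill climbing, the length-normalized z-normalized distance, the exclusion zone of one window length, and every function and parameter name below are assumptions chosen for illustration. The spatial side of the problem (propagation across road segments) is left out entirely.

```python
import math
import random


def znorm(x):
    """Z-normalize a window; a constant window maps to all zeros."""
    mu = sum(x) / len(x)
    sd = math.sqrt(sum((v - mu) ** 2 for v in x) / len(x))
    return [(v - mu) / sd if sd > 0 else 0.0 for v in x]


def dist(series, i, j, m):
    """Length-normalized z-normalized Euclidean distance between two windows."""
    a, b = znorm(series[i:i + m]), znorm(series[j:j + m])
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / m)


def matrix_profile(series, m):
    """Naive matrix profile: distance from each window to its nearest
    non-overlapping window. Needs len(series) >= 3 * m - 1."""
    n = len(series) - m + 1
    return [min(dist(series, i, j, m) for j in range(n) if abs(i - j) >= m)
            for i in range(n)]


def latin_hypercube(n_samples, n_dims, rng):
    """Points in [0, 1)^n_dims with one point per stratum in every dimension."""
    cols = []
    for _ in range(n_dims):
        strata = list(range(n_samples))
        rng.shuffle(strata)
        cols.append([(s + rng.random()) / n_samples for s in strata])
    return list(zip(*cols))


def anomaly_score(series, start, length, n_refs, rng):
    """Distance from a candidate to the nearest of a few sampled reference windows."""
    last = len(series) - length
    refs = {min(last, int(u * (last + 1))) for (u,) in latin_hypercube(n_refs, 1, rng)}
    refs = [j for j in refs if abs(j - start) >= length]
    return min((dist(series, start, j, length) for j in refs), default=0.0)


def refine(series, start, length, min_len, max_len, iters, n_refs, rng):
    """Stage 2 stand-in (assumed): hill climbing over (start, length)."""
    best = (start, length)
    best_score = anomaly_score(series, start, length, n_refs, rng)
    for _ in range(iters):
        new_len = min(max_len, max(min_len, best[1] + rng.randint(-2, 2)))
        new_start = min(len(series) - new_len, max(0, best[0] + rng.randint(-2, 2)))
        score = anomaly_score(series, new_start, new_len, n_refs, rng)
        if score > best_score:
            best, best_score = (new_start, new_len), score
    return best, best_score


# Toy data: a period-20 sine with a faster burst injected at t = 100..111.
series = [math.sin(2 * math.pi * t / 20) for t in range(200)]
for t in range(100, 112):
    series[t] = math.sin(2 * math.pi * t / 5)

rng = random.Random(0)
m = 10
profile = matrix_profile(series, m)  # stage 1: flag the most unusual window
candidate = max(range(len(profile)), key=profile.__getitem__)
(best_start, best_len), score = refine(series, candidate, m, 6, 20, 50, 15, rng)
print(candidate, best_start, best_len, round(score, 3))
```

On this toy series the stage 1 candidate falls in a window that overlaps the injected burst, because every other window has a near-exact periodic match elsewhere. Because each call to `anomaly_score` draws a fresh set of reference windows, the score is a noisy estimate; that is the price of comparing against 15 sampled windows instead of all of them.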


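Source journal

Applied Soft Computing (Engineering & Technology - Computer Science: Interdisciplinary Applications)

CiteScore: 15.80

Self-citation rate: 6.90%

Articles published: 874

Review time: 10.9 months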

Journal description: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in application and convergence of the areas of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets and other similar techniques to address real-world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.