Lijun Han, Gang Liang, Pengcheng Wang, Dingming Liu, Kui Zhao
Neurocomputing, volume 658, Article 131669. Published 2025-09-26. DOI: 10.1016/j.neucom.2025.131669. https://www.sciencedirect.com/science/article/pii/S0925231225023410
StreamVAD: A streaming framework with progressive context integration for multi-temporal scale video anomaly detection
Video anomaly detection (VAD) plays a crucial role in intelligent surveillance systems by identifying abnormal events in video streams. However, most existing methods either rely on isolated feature extraction—failing to model inter-action contextual relationships critical for complex anomaly recognition—or demand full-video processing via graph/hierarchical architectures, which incur high latency, computational burden, and parameter/memory inefficiency with depth. Lightweight designs mitigate costs but sacrifice temporal sensitivity through shallow networks and short-clip inputs, limiting detection of subtle or multi-scale anomalies in streaming scenarios. To address these challenges, we propose StreamVAD, a lightweight streaming anomaly detection framework that achieves low-latency, long-term temporal modeling with minimal computational overhead. A Key Clip Generator (KCG) filters redundant inputs in a streaming manner, allowing the model to focus on informative content while reducing computational cost. A progressive context integration (PCI) module incrementally expands the temporal receptive field by integrating historical context without full-sequence buffering, enabling efficient detection of complex long-term anomalies. Additionally, a multi-scale temporal selection (MTS) strategy dynamically adapts temporal resolution to capture both short- and long-term abnormalities. Extensive experiments on UCF-Crime, XD-Violence, and a supplemental long-term anomaly dataset demonstrate that StreamVAD achieves effective video anomaly detection with fewer parameters and lower latency. The code and dataset are available at https://github.com/Han-lijun/StreamVAD.
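The abstract does not specify how KCG, PCI, or MTS are implemented, so the following is only a rough sketch of the general ideas it describes: dropping redundant clips as they arrive, and accumulating historical context without buffering the full sequence. Everything here is an assumption made for illustration and is not taken from the StreamVAD code: the class name `StreamingContextSketch`, the cosine-similarity threshold used to decide redundancy, and the use of exponential moving averages with several decay rates as a stand-in for multiple temporal scales.

```python
import math
from typing import List, Optional, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class StreamingContextSketch:
    """Hypothetical illustration only; NOT the StreamVAD implementation."""

    def __init__(self, sim_threshold: float = 0.95,
                 decays: Sequence[float] = (0.5, 0.9)) -> None:
        # Assumption: a clip is redundant if it is too similar to the last kept clip.
        self.sim_threshold = sim_threshold
        # Assumption: each decay rate plays the role of one temporal scale
        # (a larger decay keeps a longer memory of past clips).
        self.decays = tuple(decays)
        self.last_key: Optional[List[float]] = None
        self.contexts: List[Optional[List[float]]] = [None] * len(self.decays)

    def is_key_clip(self, feat: Vector) -> bool:
        """Keep the first clip, then only clips that differ enough from the last kept one."""
        if self.last_key is None:
            return True
        return cosine(feat, self.last_key) < self.sim_threshold

    def update(self, feat: Vector) -> Optional[List[List[float]]]:
        """Consume one clip feature.

        Returns one context vector per decay rate, or None if the clip
        was filtered out as redundant. Only O(len(decays)) vectors are
        stored, so no full-sequence buffer is needed.
        """
        if not self.is_key_clip(feat):
            return None
        self.last_key = list(feat)
        for i, d in enumerate(self.decays):
            prev = self.contexts[i]
            if prev is None:
                self.contexts[i] = list(feat)
            else:
                # Exponential moving average: blend history with the new clip.
                self.contexts[i] = [d * p + (1 - d) * x for p, x in zip(prev, feat)]
        return [list(c) for c in self.contexts if c is not None]
```

In such a sketch, a downstream scorer would read the per-scale context vectors returned by `update` for each kept clip. The constant memory footprint is the point of the design choice: the state grows with the number of scales, not with the length of the stream.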
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.