Cross-Scale Spatiotemporal Memory-Augmented Network for Unsupervised Video Anomaly Detection
Authors: Lihu Pan, Bingyi Li, Shouxin Peng, Rui Zhang, Linliang Zhang
Journal: Concurrency and Computation: Practice and Experience, vol. 37 (25-26), published 2025-09-28
DOI: 10.1002/cpe.70315
URL: https://onlinelibrary.wiley.com/doi/10.1002/cpe.70315
Video anomaly detection (VAD), a critical task in intelligent surveillance systems, faces two key challenges: dynamic behavioral characterization under complex scenarios and robust spatiotemporal context modeling. Existing methods suffer from inadequate cross-scale feature fusion, weak channel-wise dependency modeling, and sensitivity to background noise. To address these issues, we propose a novel multi-scale spatiotemporal feature augmentation framework. Our approach introduces three core innovations: (1) a hierarchical feature pyramid architecture for multi-granularity representation learning, capturing both local motion patterns and global scene semantics; (2) a channel-adaptive attention mechanism that dynamically models long-range spatiotemporal dependencies; and (3) a spatiotemporal Gaussian difference module that enhances anomaly response through frequency-domain feature reconstruction, effectively suppressing noise interference. Extensive experiments on the UCSD Ped1/2, CUHK Avenue, and ShanghaiTech benchmarks demonstrate that our method achieves state-of-the-art performance, outperforming existing approaches in both accuracy and robustness.
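The abstract does not describe how the Gaussian difference module is built, so the following is only a generic illustration of the underlying idea, not the authors' implementation. It applies a plain difference-of-Gaussians (a band-pass filter) along the time axis of a single pixel's intensity trace. All names (`gaussian_kernel`, `smooth`, `difference_of_gaussians`), the two sigma values, and the toy data are assumptions made for this sketch.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, kernel):
    """Filter a sequence with the kernel, replicating the edge values
    so the output has the same length as the input."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for t in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(t + k - radius, 0), n - 1)  # clamp to valid frames
            acc += w * signal[idx]
        out.append(acc)
    return out


def difference_of_gaussians(signal, sigma_narrow=1.0, sigma_wide=3.0):
    """Narrow-Gaussian smoothing minus wide-Gaussian smoothing.

    Slowly varying background cancels out, while short bursts survive.
    The sigma values are illustrative assumptions, not taken from the paper.
    """
    radius = int(math.ceil(3 * sigma_wide))
    narrow = smooth(signal, gaussian_kernel(sigma_narrow, radius))
    wide = smooth(signal, gaussian_kernel(sigma_wide, radius))
    return [a - b for a, b in zip(narrow, wide)]


# Toy intensity trace over 40 frames: flat background with a short burst
# in frames 18 to 21 (hypothetical data, for illustration only).
trace = [10.0] * 40
for t in range(18, 22):
    trace[t] = 20.0

response = difference_of_gaussians(trace)
peak_frame = max(range(len(response)), key=lambda t: abs(response[t]))
print("strongest response at frame", peak_frame)
```

On a constant trace the two smoothed versions coincide and the response is essentially zero, which is the noise-suppression intuition; the short burst produces a clear positive response inside the burst window. A real module would operate on learned feature maps across space and time rather than on a single raw pixel trace.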
About the journal:
Concurrency and Computation: Practice and Experience (CCPE) publishes high-quality, original research papers, and authoritative research review papers, in the overlapping fields of:
Parallel and distributed computing;
High-performance computing;
Computational and data science;
Artificial intelligence and machine learning;
Big data applications, algorithms, and systems;
Network science;
Ontologies and semantics;
Security and privacy;
Cloud/edge/fog computing;
Green computing; and
Quantum computing.