Video anomaly detection based on frame memory bank and decoupled asymmetric convolutions

IF 1.0 | Region 4, Computer Science | JCR Q4, ENGINEERING, ELECTRICAL & ELECTRONIC

Journal of Electronic Imaging | Pub Date: 2024-09-01 | DOI: 10.1117/1.jei.33.5.053006

Min Zhao, Chuanxu Wang, Jiajiong Li, Zitai Jiang


Citations: 0

Abstract


Video anomaly detection based on frame memory bank and decoupled asymmetric convolutions

Video anomaly detection (VAD) is essential for monitoring systems. Prediction-based methods identify anomalies by comparing predicted frames with the real frames. We propose an unsupervised VAD method based on a frame memory bank (FMB) and decoupled asymmetric convolution (DAConv). It addresses three problems encountered with auto-encoders (AE) in VAD: (1) how to mitigate the noise caused by jitter between frames, a problem that is otherwise ignored; (2) how to alleviate both the insufficient use of temporal information by traditional two-dimensional (2D) convolution and the heavy computational cost of three-dimensional (3D) convolution; and (3) how to make full use of normal data to improve the reliability of anomaly discrimination. First, we design a separate network to calibrate the video frames in the dataset. Second, we design DAConv to extract features from the video, addressing the absence of temporal information in 2D convolutions and the high computational complexity of 3D convolutions. In addition, an interval-frame mechanism mitigates the information redundancy caused by data reuse. Finally, we embed an FMB to store features of normal events, amplifying the contrast between normal and abnormal frames. Experiments on the UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets yield AUC values of 98.7%, 90.4%, and 74.8%, respectively, supporting the effectiveness of the proposed method.
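The abstract does not state how the difference between predicted and real frames is turned into an anomaly score, or how the AUC is computed. The sketch below assumes the PSNR-based score that is common in prediction-based VAD: per-frame PSNR is min-max normalised over a video and inverted, and a frame-level AUC is then computed against ground-truth labels. The function names, the toy frames, and the choice of PSNR are illustrative assumptions, not the authors' implementation.

```python
import math


def psnr(pred, real, max_val=1.0):
    """PSNR (dB) between two equally sized frames given as flat lists of floats."""
    mse = sum((p - r) ** 2 for p, r in zip(pred, real)) / len(real)
    # Clamp the error so a perfect prediction does not produce an infinite PSNR.
    return 10.0 * math.log10(max_val ** 2 / max(mse, 1e-12))


def anomaly_scores(psnrs):
    """Min-max normalise PSNR over one video and invert it: low PSNR -> high score."""
    lo, hi = min(psnrs), max(psnrs)
    if hi == lo:
        return [0.0 for _ in psnrs]
    return [1.0 - (p - lo) / (hi - lo) for p in psnrs]


def auc(scores, labels):
    """Frame-level AUC: chance that an abnormal frame outscores a normal one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one abnormal frame")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): four 4-pixel frames; the third is predicted poorly.
real_frames = [[0.5] * 4 for _ in range(4)]
pred_frames = [[0.51] * 4, [0.52] * 4, [0.70] * 4, [0.51] * 4]
labels = [0, 0, 1, 0]  # 1 marks an abnormal frame

frame_psnrs = [psnr(p, r) for p, r in zip(pred_frames, real_frames)]
scores = anomaly_scores(frame_psnrs)
print(scores, auc(scores, labels))
```

In this toy case the poorly predicted frame receives the highest score, so it is ranked above every normal frame. On real data the same ranking-based AUC would be computed over all test frames of a dataset.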


Source journal

Journal of Electronic Imaging (Engineering and Technology: Imaging Science and Photographic Technology)

CiteScore: 1.70
Self-citation rate: 27.30%
Articles published: 341
Review time: 4.0 months

Journal description: The Journal of Electronic Imaging publishes peer-reviewed papers in all technology areas that make up the field of electronic imaging and are normally considered in the design, engineering, and applications of electronic imaging systems.