Adversarial diffusion for few-shot scene adaptive video anomaly detection

IF 5.5 · Zone 2 (Computer Science) · JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Yumna Zahid, Christine Zarges, Bernie Tiddeman, Jungong Han
DOI: 10.1016/j.neucom.2024.128796
Journal: Neurocomputing
Publication date: 2024-10-28
Publication type: Journal Article
Open access: No
Source: https://www.sciencedirect.com/science/article/pii/S0925231224015674
Citations: 0

Abstract

Adversarial diffusion for few-shot scene adaptive video anomaly detection
Few-shot anomaly detection for video surveillance is challenging because target domains are so diverse. Existing methods treat it as a one-class classification problem and train on a reduced sample of nominal scenes. They rely on either frame reconstruction or frame prediction to learn a manifold against which outliers can be detected during inference. We posit that the quality of image reconstruction or future frame prediction is inherently important in identifying anomalous pixels in video frames. In this paper, we enhance image synthesis and mode coverage for video anomaly detection (VAD) by integrating a Denoising Diffusion model with a future frame prediction model. Our novel VAD pipeline combines a Generative Adversarial Network with denoising diffusion to learn the underlying non-anomalous data distribution and generate high-fidelity future-frame samples in one step. We further regularize the image reconstruction with perceptual quality metrics such as Multi-scale Structural Similarity Index Measure and Peak Signal-to-Noise Ratio, ensuring high-quality output within a few episodic training iterations. Extensive experiments demonstrate that our method outperforms state-of-the-art techniques across multiple benchmarks, validating that high-quality image synthesis in frame prediction leads to robust anomaly detection in videos.
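To make the link between prediction quality and anomaly detection concrete, the following is a minimal sketch of how Peak Signal-to-Noise Ratio between a predicted frame and the actual frame could be turned into a per-frame anomaly score. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the scoring rule, so the min-max normalisation used here (a convention common in future-frame-prediction VAD work) is an assumption, as are the function names `psnr` and `anomaly_scores`, the [0, 1] pixel range, and the synthetic frames.

```python
import numpy as np


def psnr(pred, target, max_val=1.0, eps=1e-12):
    """PSNR in dB between two frames with pixel values in [0, max_val]."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    # eps avoids division by zero when the prediction is perfect.
    return 10.0 * np.log10(max_val ** 2 / (mse + eps))


def anomaly_scores(pred_frames, true_frames):
    """Map per-frame PSNR to a score in [0, 1]; higher means more anomalous.

    Assumption: PSNR is min-max normalised over the clip, so the
    worst-predicted frame gets score 1 and the best-predicted gets 0.
    """
    psnrs = np.array([psnr(p, t) for p, t in zip(pred_frames, true_frames)])
    span = psnrs.max() - psnrs.min()
    if span == 0:
        return np.zeros_like(psnrs)
    regularity = (psnrs - psnrs.min()) / span
    return 1.0 - regularity


# Synthetic stand-in for a predictor: 5 frames of 8x8 pixels, predicted
# almost perfectly, except frame 3 where a patch is badly mispredicted.
rng = np.random.default_rng(0)
true_frames = rng.random((5, 8, 8))
pred_frames = true_frames + 0.01 * rng.standard_normal((5, 8, 8))
pred_frames[3, 2:6, 2:6] += 0.5  # hypothetical "anomalous" region

scores = anomaly_scores(pred_frames, true_frames)
print(scores)
```

In this toy clip the badly predicted frame receives the highest score, which is the intuition behind the abstract's claim: a model that predicts normal frames with high fidelity leaves a larger quality gap on frames it cannot predict.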
Source journal
Neurocomputing (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 13.10
Self-citation rate: 10.00%
Articles published: 1382
Review time: 70 days
About the journal: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.