{"title":"智能视频监控系统中的变压器支持弱监督异常事件检测","authors":"Shalmiya Paulraj , Subramaniyaswamy Vairavasundaram","doi":"10.1016/j.engappai.2024.109496","DOIUrl":null,"url":null,"abstract":"<div><div>Video Anomaly Detection (VAD) for weakly supervised data operates with limited video-level annotations. It also holds the practical significance to play a pivotal role in surveillance and security applications like public safety, patient monitoring, autonomous vehicles, etc. Moreover, VAD extends its utility to various industrial settings, where it is instrumental in safeguarding workers' safety, enabling real-time production quality monitoring, and predictive maintenance. These diverse applications highlight the versatility of VAD and its potential to transform processes across various industries, making it an essential tool along with traditional surveillance applications. The majority of the existing studies have been focused on mitigating critical aspects of VAD, such as reducing false alarm rates and misdetection. These challenges can be effectively addressed by capturing the intricate spatiotemporal pattern within video data. Therefore, the proposed work named Swin Transformer-based Hybrid Temporal Adaptive Module (ST-HTAM) Abnormal Event Detection introduces an intuitive temporal module along with leveraging the strengths of the Swin (Shifted window-based) Transformers for spatial analysis. The novel aspect of this work lies in the hybridization of global self-attention and Convolutional-Long Short Term Memory (C-LSTM) Networks are renowned for capturing both global and local temporal dependencies. By extracting these spatial and temporal components, the proposed method, ST-HTAM, offers a comprehensive understanding of anomalous events. Altogether, it enhances the accuracy and robustness of Weakly Supervised VAD (WS-VAD). Finally, an anomaly scoring mechanism is employed in the classification step to facilitate effective anomaly detection from test video data. The proposed system is tailored to operate in real-time and highlights the dual focus on sophisticated Artificial Intelligence (AI) techniques and their impactful use cases across diverse domains. Comprehensive experiments are conducted on benchmark datasets that clearly show the substantial superiority of the ST-HTAM over state-of-the-art approaches. Code is available at <span><span>https://github.com/Shalmiyapaulraj78/STHTAM-VAD</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":null,"pages":null},"PeriodicalIF":7.5000,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Transformer-enabled weakly supervised abnormal event detection in intelligent video surveillance systems\",\"authors\":\"Shalmiya Paulraj , Subramaniyaswamy Vairavasundaram\",\"doi\":\"10.1016/j.engappai.2024.109496\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Video Anomaly Detection (VAD) for weakly supervised data operates with limited video-level annotations. It also holds the practical significance to play a pivotal role in surveillance and security applications like public safety, patient monitoring, autonomous vehicles, etc. Moreover, VAD extends its utility to various industrial settings, where it is instrumental in safeguarding workers' safety, enabling real-time production quality monitoring, and predictive maintenance. 
These diverse applications highlight the versatility of VAD and its potential to transform processes across various industries, making it an essential tool along with traditional surveillance applications. The majority of the existing studies have been focused on mitigating critical aspects of VAD, such as reducing false alarm rates and misdetection. These challenges can be effectively addressed by capturing the intricate spatiotemporal pattern within video data. Therefore, the proposed work named Swin Transformer-based Hybrid Temporal Adaptive Module (ST-HTAM) Abnormal Event Detection introduces an intuitive temporal module along with leveraging the strengths of the Swin (Shifted window-based) Transformers for spatial analysis. The novel aspect of this work lies in the hybridization of global self-attention and Convolutional-Long Short Term Memory (C-LSTM) Networks are renowned for capturing both global and local temporal dependencies. By extracting these spatial and temporal components, the proposed method, ST-HTAM, offers a comprehensive understanding of anomalous events. Altogether, it enhances the accuracy and robustness of Weakly Supervised VAD (WS-VAD). Finally, an anomaly scoring mechanism is employed in the classification step to facilitate effective anomaly detection from test video data. The proposed system is tailored to operate in real-time and highlights the dual focus on sophisticated Artificial Intelligence (AI) techniques and their impactful use cases across diverse domains. Comprehensive experiments are conducted on benchmark datasets that clearly show the substantial superiority of the ST-HTAM over state-of-the-art approaches. Code is available at <span><span>https://github.com/Shalmiyapaulraj78/STHTAM-VAD</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":50523,\"journal\":{\"name\":\"Engineering Applications of Artificial Intelligence\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2024-10-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Engineering Applications of Artificial Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0952197624016543\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Engineering Applications of Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0952197624016543","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Citations: 0
Transformer-enabled weakly supervised abnormal event detection in intelligent video surveillance systems
Video Anomaly Detection (VAD) for weakly supervised data operates with limited video-level annotations. It also holds practical significance, playing a pivotal role in surveillance and security applications such as public safety, patient monitoring, and autonomous vehicles. Moreover, VAD extends its utility to various industrial settings, where it is instrumental in safeguarding worker safety, enabling real-time production quality monitoring, and supporting predictive maintenance. These diverse applications highlight the versatility of VAD and its potential to transform processes across industries, making it an essential tool alongside traditional surveillance applications. The majority of existing studies have focused on critical aspects of VAD such as reducing false alarm rates and misdetections. These challenges can be effectively addressed by capturing the intricate spatiotemporal patterns within video data. Therefore, the proposed work, Swin Transformer-based Hybrid Temporal Adaptive Module (ST-HTAM) abnormal event detection, introduces an intuitive temporal module while leveraging the strengths of the Swin (Shifted-window) Transformer for spatial analysis. The novel aspect of this work lies in hybridizing global self-attention with Convolutional Long Short-Term Memory (C-LSTM) networks, which are renowned for capturing both global and local temporal dependencies. By extracting these spatial and temporal components, the proposed method, ST-HTAM, offers a comprehensive understanding of anomalous events and thereby enhances the accuracy and robustness of Weakly Supervised VAD (WS-VAD). Finally, an anomaly scoring mechanism is employed in the classification step to facilitate effective anomaly detection on test video data. The proposed system is tailored to operate in real time and highlights the dual focus on sophisticated Artificial Intelligence (AI) techniques and their impactful use cases across diverse domains. Comprehensive experiments on benchmark datasets clearly show the substantial superiority of ST-HTAM over state-of-the-art approaches. Code is available at https://github.com/Shalmiyapaulraj78/STHTAM-VAD.
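To make the described pipeline concrete, the sketch below shows one way the components named in the abstract could fit together in PyTorch: a Swin Transformer backbone extracting per-frame spatial features, a hybrid temporal module combining global self-attention with a recurrent branch (a plain LSTM stands in here for the paper's C-LSTM), and a per-frame anomaly scoring head trained from video-level labels only. This is a minimal illustrative sketch, not the authors' implementation; the torchvision swin_t backbone, all dimensions, and the max-score video-level loss are assumptions made for the example.

```python
# Illustrative sketch only (not the authors' ST-HTAM code): Swin spatial encoder +
# hybrid temporal module (self-attention + LSTM branch) + anomaly scoring head.
import torch
import torch.nn as nn
from torchvision.models import swin_t, Swin_T_Weights


class HybridTemporalModule(nn.Module):
    """Fuses global self-attention with an LSTM branch over per-frame features."""

    def __init__(self, dim=768, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)  # global temporal dependencies
        self.lstm = nn.LSTM(dim, dim, batch_first=True)                  # local/sequential dependencies
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):                      # x: (batch, time, dim)
        attn_out, _ = self.attn(x, x, x)       # global temporal context
        lstm_out, _ = self.lstm(x)             # local temporal context
        return self.norm(x + attn_out + lstm_out)


class AnomalyDetector(nn.Module):
    def __init__(self, dim=768):
        super().__init__()
        backbone = swin_t(weights=Swin_T_Weights.DEFAULT)
        backbone.head = nn.Identity()          # keep the 768-d pooled features per frame
        self.backbone = backbone
        self.temporal = HybridTemporalModule(dim)
        self.score = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(),
                                   nn.Linear(128, 1), nn.Sigmoid())

    def forward(self, clips):                  # clips: (batch, time, 3, 224, 224)
        b, t = clips.shape[:2]
        feats = self.backbone(clips.flatten(0, 1)).view(b, t, -1)   # per-frame spatial features
        feats = self.temporal(feats)                                 # spatio-temporal fusion
        return self.score(feats).squeeze(-1)                        # per-frame anomaly scores in [0, 1]


# Weak supervision uses only video-level labels. A common (assumed) choice is a
# max-score objective: the highest frame score stands in for the video label.
def video_level_loss(frame_scores, video_labels):
    video_scores = frame_scores.max(dim=1).values                   # (batch,)
    return nn.functional.binary_cross_entropy(video_scores, video_labels.float())
```

At training time only the video-level loss at the end is needed, while the per-frame scores it is computed from are what the detector reports on test videos.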
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.