Real Life Violence Detection in Surveillance Videos using Spatiotemporal Features

2021 Thirteenth International Conference on Contemporary Computing (IC3-2021). Pub Date: 2021-08-05. DOI: 10.1145/3474124.3474161

Anugrah Srivastava, Tapas Badal, Rishav Singh


Citations: 1

Abstract

Automatic violence detection is of considerable importance from both practical and academic points of view. Detecting violence in a crowded locality by computational means is challenging owing to rapid movements, overlapping characteristics, obstructed scenery, and scattered backgrounds. Fortunately, Deep Learning techniques can detect anomalies to a certain extent, and their popularity as a paradigm for detecting violence is growing at a tremendous pace. The aim of such approaches is to develop a method that recognizes violence and raises an alarm so that immediate assistance can be provided. This paper follows the same line of thought. It presents a Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) based approach for violence detection that learns detailed features in videos. The spatio-temporal features extracted from the combination of a pre-trained InceptionV3 model and a late LSTM architecture yielded an accuracy of 97.5%, thereby demonstrating its superiority over existing methods in the literature.
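As a rough illustration of the data flow the abstract describes, the sketch below runs a single-layer LSTM over a sequence of per-frame feature vectors and maps the final hidden state to a violence probability. It is not the authors' implementation. It assumes the per-frame features have already been extracted (2048 values per frame, the size of InceptionV3's globally pooled output), it reads "late LSTM" as an LSTM applied after the CNN stage, and it uses random, untrained weights. The hidden width, the clip length of 16 frames, and all function names are hypothetical, since the abstract gives none of them.

```python
import numpy as np

FEATURE_DIM = 2048  # size of InceptionV3's globally pooled output per frame
HIDDEN_DIM = 64     # hypothetical LSTM width; not stated in the abstract


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(rng, feature_dim=FEATURE_DIM, hidden_dim=HIDDEN_DIM):
    """Random, untrained weights; a real system would learn these from labeled clips."""
    scale = 0.01
    return {
        "W": rng.normal(0.0, scale, (4 * hidden_dim, feature_dim)),  # input weights
        "U": rng.normal(0.0, scale, (4 * hidden_dim, hidden_dim)),   # recurrent weights
        "b": np.zeros(4 * hidden_dim),
        "w_out": rng.normal(0.0, scale, hidden_dim),                 # classifier head
        "b_out": 0.0,
    }


def lstm_last_state(features, params):
    """Run an LSTM over features of shape (T, feature_dim); return the final hidden state."""
    hidden_dim = params["U"].shape[1]
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    for x_t in features:
        z = params["W"] @ x_t + params["U"] @ h + params["b"]
        i, f, g, o = np.split(z, 4)           # input, forget, candidate, output
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g                     # update the cell state
        h = o * np.tanh(c)                    # update the hidden state
    return h


def violence_probability(features, params):
    """Map the clip-level LSTM state to a probability with a logistic output unit."""
    h = lstm_last_state(features, params)
    return float(sigmoid(params["w_out"] @ h + params["b_out"]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(rng)
    # Stand-in for CNN features of a 16-frame clip (random values, for shape only).
    clip_features = rng.normal(size=(16, FEATURE_DIM))
    print(violence_probability(clip_features, params))
```

With untrained weights the output carries no meaning. The point is only the shape of the pipeline: the CNN supplies one spatial descriptor per frame, and the recurrent stage aggregates those descriptors over time before a single clip-level decision is made.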

