Efficient Violence Detection Using 3D Convolutional Neural Networks

2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS). Pub Date: 2019-09-01. DOI: 10.1109/AVSS.2019.8909883

Ji Li, Xinghao Jiang, Tanfeng Sun, Ke Xu


Citations: 38

Abstract

Automatically analyzing violent content in surveillance videos is important for many applications, ranging from Internet video filtering to public security protection. In this paper, we propose a deep learning model based on 3D convolutional neural networks that uses neither hand-crafted features nor RNN architectures dedicated solely to encoding temporal information. The improved internal design adopts compact but effective bottleneck units for learning motion patterns and leverages the DenseNet architecture to promote feature reuse and channel interaction; this design proves more capable of capturing spatiotemporal features while requiring relatively fewer parameters. The model's recognition accuracy is validated on three standard datasets against other advanced approaches, and supplementary experiments evaluate its effectiveness and efficiency. The final results demonstrate the advantages of the proposed model over state-of-the-art methods in both recognition accuracy and computational efficiency.
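The parameter saving that the abstract attributes to bottleneck units inside a densely connected 3D network can be illustrated with simple weight counting. The sketch below is not the authors' architecture: the channel counts, growth rate, bottleneck width, and number of layers are assumptions chosen for illustration, and all function names are hypothetical.

```python
def conv3d_params(in_ch, out_ch, kernel=(3, 3, 3), bias=False):
    """Number of learnable weights in one 3D convolution layer."""
    kt, kh, kw = kernel
    weights = in_ch * out_ch * kt * kh * kw
    return weights + (out_ch if bias else 0)


def plain_dense_layer_params(in_ch, growth_rate):
    # A single 3x3x3 convolution maps all accumulated channels
    # to `growth_rate` new feature maps.
    return conv3d_params(in_ch, growth_rate)


def bottleneck_dense_layer_params(in_ch, growth_rate, width=4):
    # Assumed bottleneck: a 1x1x1 convolution first reduces the input to
    # width * growth_rate channels, then a 3x3x3 convolution produces
    # the `growth_rate` new feature maps.
    mid = width * growth_rate
    squeeze = conv3d_params(in_ch, mid, kernel=(1, 1, 1))
    spatial = conv3d_params(mid, growth_rate)
    return squeeze + spatial


def dense_block_params(in_ch, growth_rate, num_layers, layer_fn):
    # Dense connectivity: each layer receives the block input
    # concatenated with the outputs of every earlier layer.
    total = 0
    channels = in_ch
    for _ in range(num_layers):
        total += layer_fn(channels, growth_rate)
        channels += growth_rate
    return total, channels


if __name__ == "__main__":
    # Illustrative sizes only; none of these values come from the paper.
    plain, out_ch = dense_block_params(256, 32, 6, plain_dense_layer_params)
    bottleneck, _ = dense_block_params(256, 32, 6, bottleneck_dense_layer_params)
    print("output channels:", out_ch)
    print("plain 3x3x3 layers:", plain)
    print("bottleneck layers:", bottleneck)
```

With these assumed sizes the bottleneck variant needs roughly half the weights of the plain one. The saving depends on the number of input channels: it grows as more feature maps accumulate and can disappear when a layer receives only a few channels.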



