Konstantinos Gkountakos, K. Ioannidis, T. Tsikrika, S. Vrochidis, Y. Kompatsiaris
Crowd Violence Detection from Video Footage
2021 International Conference on Content-Based Multimedia Indexing (CBMI), published 2021-06-28
DOI: https://doi.org/10.1109/CBMI50038.2021.9461921
Citation count: 2
Abstract
Surveillance systems now deploy a variety of devices that capture visual content (such as CCTV, body-worn cameras, and smartphone cameras), which makes monitoring the video footage obtained from multiple such devices a complex task. The task is especially challenging when monitoring social events that involve large crowds, particularly when there is a risk of crowd violence. This paper presents and demonstrates a crowd violence detection system that processes and analyzes crowd-based video footage and alerts potential stakeholders when violence-related content is identified. The proposed end-to-end framework is based on deep neural networks and uses a 3D Convolutional Neural Network (CNN) for the (near) real-time analysis of video streams and video files. The framework is trained, evaluated, and demonstrated using the Violent Flows dataset, a crowd violence dataset that is widely used in research. It is provided as a standalone application for desktop environments.
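The abstract does not describe how frames are grouped before they reach the 3D CNN. The following is a minimal sketch of one common arrangement, assuming overlapping fixed-length clips cut from the frame sequence and a per-clip violence score compared against a threshold. The names `sliding_clips`, `detect_violence`, and `score_clip`, as well as the clip length, stride, and threshold values, are hypothetical and are not taken from the paper; `score_clip` stands in for the trained network.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

Frame = TypeVar("Frame")  # placeholder for a decoded video frame


def sliding_clips(
    frames: Iterable[Frame], clip_len: int = 16, stride: int = 8
) -> Iterator[Tuple[int, List[Frame]]]:
    """Yield (start_index, clip) for overlapping fixed-length clips.

    clip_len and stride are assumed values, not the paper's settings.
    """
    if clip_len <= 0 or stride <= 0:
        raise ValueError("clip_len and stride must be positive")
    buffer: deque = deque(maxlen=clip_len)  # keeps only the newest frames
    for index, frame in enumerate(frames):
        buffer.append(frame)
        start = index - clip_len + 1
        # Emit a clip once the buffer is full and the start is on the stride.
        if start >= 0 and start % stride == 0:
            yield start, list(buffer)


def detect_violence(
    frames: Iterable[Frame],
    score_clip: Callable[[List[Frame]], float],
    clip_len: int = 16,
    stride: int = 8,
    threshold: float = 0.5,
) -> List[Tuple[int, float]]:
    """Return (start_index, score) for every clip scored at or above threshold.

    score_clip is a stand-in for the trained 3D CNN (hypothetical interface).
    """
    alerts = []
    for start, clip in sliding_clips(frames, clip_len, stride):
        score = score_clip(clip)
        if score >= threshold:
            alerts.append((start, score))
    return alerts
```

Because `sliding_clips` consumes any iterable, the same loop can be fed from a video file or a live stream; a stream only needs to yield frames as they arrive. The stride sets the trade-off between detection latency and the number of network evaluations per second.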