Violence Detection using Feature Fusion of Optical Flow and 3D CNN on AICS-Violence Dataset
C. Vo-Le, Hung Sy Vo, Thien Duy Vu, Nguyen Hong Son
2022 IEEE Ninth International Conference on Communications and Electronics (ICCE), published 2022-07-27
DOI: 10.1109/ICCE55644.2022.9852065 (https://doi.org/10.1109/ICCE55644.2022.9852065)
Citations: 1
Abstract
In this paper, we introduce AICS-Violence, a new self-developed violence dataset. It contains 7576 high-resolution video clips of violent and non-violent scenarios collected by outdoor security cameras. It also includes two test sets that contain additional actions that are non-violent but appear violent. One of these test sets was recorded from a camera angle different from that of the training set and the other test set. To focus on each group of people in a frame, we develop a method that automatically crops candidate boxes from detected human bounding boxes. Furthermore, we propose two methods, named 3D DenseNet Fusion OF RGB and 3D DenseNet Fusion OFnom RGB. In both methods, two 3D DenseNets extract features from the RGB frames and from the visualized optical flow, respectively. The two feature streams are then fused by addition in the first method and by normalization-based multiplication in the second. Evaluation results show that our methods achieve slightly better performance on the first test set and, on the second test set, significantly better generalization than several other state-of-the-art methods.
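The two fusion strategies named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify the normalization or the layer at which fusion occurs, so everything here is an assumption made for illustration: the function names `fuse_add`, `l2_normalize`, and `fuse_norm_mul` are hypothetical, L2 normalization is only one plausible reading of "normalization-based", and the toy vectors stand in for features that would come from the two 3D DenseNets.

```python
import numpy as np


def fuse_add(rgb_feat, flow_feat):
    """Additive fusion: element-wise sum of the two feature arrays."""
    return rgb_feat + flow_feat


def l2_normalize(x, eps=1e-12):
    """Scale each feature vector (last axis) to unit L2 norm.

    Assumption: the paper's normalization may differ from this one.
    """
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def fuse_norm_mul(rgb_feat, flow_feat):
    """Multiplicative fusion: normalize each stream, then multiply element-wise."""
    return l2_normalize(rgb_feat) * l2_normalize(flow_feat)


# Toy features for a batch of one clip (stand-ins for 3D DenseNet outputs).
rgb = np.array([[3.0, 4.0]])
flow = np.array([[0.0, 2.0]])

added = fuse_add(rgb, flow)
multiplied = fuse_norm_mul(rgb, flow)
```

Normalizing before multiplying keeps the product bounded regardless of the scale of either stream, which is one plausible motivation for pairing the two operations; the abstract itself does not state the authors' reasoning.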