Title: Fishy Action Recognition from Cross-Event Integrated Video Dataset Based on Deep 3D and Mixed CNN models
Authors: Raisa Begum, Md. Sajjatul Islam
Venue: 2021 Fourth International Conference on Electrical, Computer and Communication Technologies (ICECCT)
Publication date: 2021-09-15
DOI: 10.1109/icecct52121.2021.9616808
URL: https://doi.org/10.1109/icecct52121.2021.9616808
Citation count: 0
Abstract
Analyzing actions in video requires examining two aspects beyond spatial content: (1) temporal features and (2) motion characteristics. Applying two-dimensional CNNs alone is therefore fundamentally limited: they do not consider temporal ordering and ultimately collapse temporal correlations. In this paper, we develop an automatic deep learning system for recognizing fishy (suspicious) actions. We implement two spatiotemporal convolutional models, pre-trained on Kinetics-400, that capture temporal, spatial, and motion information in video sequences through residual learning. We consider six categories of suspicious videos containing only fine-grained atomic actions, drawn from five different datasets, and combine them into a composite dataset. To reduce overfitting and make the dataset more robust and general, we generate synthetic data through offline augmentation. To the best of our knowledge, this is the first attempt to recognize malicious actions across an integrated dataset comprising five cross-event, heterogeneous action datasets, which required us to address multiple challenges arising from domain shift. The system achieves a high level of accuracy on this composite dataset.
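The claim that frame-wise 2D processing collapses temporal correlations can be illustrated with a toy sketch. This is not the paper's models or data: the "clip" is a hypothetical list of tiny frames, the per-frame mean stands in for a 2D CNN feature, and a 1D kernel along time stands in for the temporal part of a spatiotemporal convolution. Under those assumptions, a clip and its reversed copy get the same pooled frame-wise descriptor but different temporal-convolution responses.

```python
# Illustrative sketch only: toy frames and hand-picked kernel, not the paper's method.

def pooled_2d_descriptor(frames):
    """Frame-wise (2D-style) features followed by temporal average pooling."""
    # Stand-in for a 2D CNN: one scalar feature per frame.
    per_frame = [sum(f) / len(f) for f in frames]
    # Averaging over time discards the order of the frames.
    return sum(per_frame) / len(per_frame)

def temporal_conv_descriptor(frames, kernel=(-1.0, 0.0, 1.0)):
    """Per-frame features followed by a 1D filter slid along time (no padding)."""
    per_frame = [sum(f) / len(f) for f in frames]
    k = len(kernel)
    # Each output mixes k neighbouring frames, so frame order matters.
    return [
        sum(kernel[j] * per_frame[t + j] for j in range(k))
        for t in range(len(per_frame) - k + 1)
    ]

# A toy clip whose brightness rises over time, and the same clip played backwards.
clip = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
reversed_clip = clip[::-1]

same_2d = pooled_2d_descriptor(clip) == pooled_2d_descriptor(reversed_clip)
same_temporal = temporal_conv_descriptor(clip) == temporal_conv_descriptor(reversed_clip)
print(same_2d, same_temporal)  # prints: True False
```

The pooled descriptor cannot tell the rising clip from the falling one, while the temporal filter responds with opposite signs. Spatiotemporal convolutions extend the same idea by filtering jointly over space and time.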