Machine Cognition of Violence in Videos Using Novel Outlier-Resistant VLAD
Tonmoay Deb, Aziz Arman, A. Firoze
2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 989-994
Published: 2018-12-01
DOI: 10.1109/ICMLA.2018.00161 (https://doi.org/10.1109/ICMLA.2018.00161)
Citation count: 12
Abstract
Recognizing violent actions in surveillance videos accurately and in real time is a demanding challenge. The contribution of this work is twofold. First, we propose a computationally efficient Bag-of-Words (BoW) pipeline that also improves the accuracy of violent video classification. The pipeline's feature extraction stage uses densely sampled Histogram of Oriented Gradients (HOG) and Histogram of Optical Flow (HOF) descriptors rather than Space-Time Interest Point (STIP) based extraction. Second, in the encoding stage, we propose Outlier-Resistant VLAD (OR-VLAD), a novel feature encoding based on higher-order statistics, to improve on the performance of the original VLAD. For classification, an efficient Linear Support Vector Machine (LSVM) is employed. The proposed pipeline is evaluated on three popular violent action datasets. In comparison, it achieves near-perfect classification accuracy on all three, outperforming most state-of-the-art approaches while using a much smaller vocabulary than previous BoW models.
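For orientation, the encoding stage can be illustrated with the original VLAD that OR-VLAD is said to improve on. The sketch below is not the authors' OR-VLAD, whose outlier handling and higher-order statistics are not specified in the abstract. It is a minimal plain-Python version of standard VLAD, assuming hard assignment to the nearest codeword, signed square-root normalization, and L2 normalization. The function name `vlad_encode` and the toy data are hypothetical.

```python
import math


def vlad_encode(descriptors, centers):
    """Standard VLAD: sum residuals of descriptors to their nearest codeword.

    descriptors: list of d-dimensional local descriptors (e.g. HOG/HOF vectors)
    centers:     list of k d-dimensional codewords (e.g. k-means centroids)
    Returns a flat list of length k * d.
    """
    k, d = len(centers), len(centers[0])
    residuals = [[0.0] * d for _ in range(k)]

    for x in descriptors:
        # Hard-assign the descriptor to its nearest codeword
        # (squared Euclidean distance).
        j = min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centers[i])),
        )
        # Accumulate the residual (descriptor minus codeword).
        for t in range(d):
            residuals[j][t] += x[t] - centers[j][t]

    # Concatenate the per-codeword residual sums into one vector.
    flat = [v for row in residuals for v in row]

    # Signed square-root (power) normalization, then L2 normalization.
    flat = [math.copysign(math.sqrt(abs(v)), v) for v in flat]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Toy usage: two 2-D codewords, three descriptors.
centers = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
code = vlad_encode(descriptors, centers)
print(len(code))  # 4
```

The output dimension is the vocabulary size times the descriptor dimension, so a smaller vocabulary directly shortens the vector passed to the linear SVM. Because residuals are simply summed, a single descriptor far from its codeword can dominate that codeword's block. This is a plausible reading of the sensitivity that an outlier-resistant variant targets, but it is an interpretation, not a statement from the abstract.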