A comparative study on deep learning and machine learning models for human action recognition in aerial videos
Surbhi Kapoor, Akashdeep Sharma, Aman Verma, Vishal Dhull, Chahat Goyal
Int. Arab J. Inf. Technol., 2023. DOI: 10.34028/iajit/20/4/2 (https://doi.org/10.34028/iajit/20/4/2)
Abstract
Unmanned Aerial Vehicles (UAVs) are widely used in video surveillance because of their low cost, high portability, and fast mobility. This paper proposes an approach that recognizes human activity in aerial video sequences from keypoints detected on the human body with OpenPose. The detected keypoints are passed to machine learning and deep learning classifiers, which classify the human actions. Experimental results show that the multilayer perceptron and SVM outperformed all other classifiers, with accuracies of 87.80% and 87.77% respectively, whereas the LSTM-based models performed worse: stacked Long Short-Term Memory (LSTM) networks reached an accuracy of 71.30% and a bidirectional LSTM reached 76.04%. The results also indicate that machine learning models performed better than deep learning models. The major reason for this finding is the limited amount of available data: deep learning models are data hungry and require large amounts of training data. The paper also analyses the failure cases of OpenPose by testing the system on aerial videos captured by a drone flying at a higher altitude. This work provides a baseline for evaluating machine learning and deep learning classifiers on human action recognition from aerial videos.
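The abstract does not say how the OpenPose keypoints are turned into classifier inputs. As a minimal sketch of one plausible preprocessing step, the snippet below flattens a single person's keypoints into a fixed-length vector that is normalised for position and scale. The function name `keypoints_to_features`, the `min_conf` threshold, the bounding-box normalisation, and the (x, y, confidence) tuple layout are all assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Sequence, Tuple

# Assumed layout: one (x, y, confidence) triple per body joint.
Keypoint = Tuple[float, float, float]


def keypoints_to_features(keypoints: Sequence[Keypoint],
                          min_conf: float = 0.1) -> List[float]:
    """Flatten keypoints into a translation- and scale-normalised vector.

    Hypothetical helper: joints whose confidence is below min_conf are
    treated as undetected and encoded as (0.0, 0.0).
    """
    visible = [(x, y) for x, y, c in keypoints if c >= min_conf]
    if not visible:
        # No joint was detected, so return an all-zero vector.
        return [0.0] * (2 * len(keypoints))

    xs = [p[0] for p in visible]
    ys = [p[1] for p in visible]
    min_x, min_y = min(xs), min(ys)
    # Scale by the longer bounding-box side; use 1.0 for a degenerate box.
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1.0

    features: List[float] = []
    for x, y, c in keypoints:
        if c >= min_conf:
            features.extend([(x - min_x) / scale, (y - min_y) / scale])
        else:
            features.extend([0.0, 0.0])
    return features


# Three joints, the last one undetected.
example = [(100.0, 200.0, 0.9), (150.0, 300.0, 0.8), (0.0, 0.0, 0.0)]
print(keypoints_to_features(example))  # [0.0, 0.0, 0.5, 1.0, 0.0, 0.0]
```

A vector like this could be fed per frame to a classifier such as an SVM or multilayer perceptron, or stacked over consecutive frames as a sequence for an LSTM. One limitation of this sketch is that an undetected joint and a joint at the bounding-box corner both encode as (0.0, 0.0).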