Implementation of Convolutional Neural Network and Long Short-Term Memory Algorithms in Human Activity Recognition Based on Visual Processing Video

Q3 Decision Sciences

JOIV International Journal on Informatics Visualization. Pub Date: 2023-05-12. DOI: 10.30630/joiv.7.2.1504

A. Rachman, H. Mubarok, Euis Nur Fitriani Dewi, Rama Edwinda Putra


Citation count: 0

Abstract

Human Activity Recognition (HAR) is an active research topic, particularly for identifying human movements in video-based security surveillance and for recognizing symptoms of an illness from movement. In this research, HAR is the key to better understanding the semantics contained in a video in order to find the pattern of a human movement, especially in sports. This study applies a combination of CNN and LSTM algorithms, with several variations of the dropout-layer and batch-size parameter values, to video that has been converted into image frames in order to produce a HAR model. The convolution layers extract spatial features from each frame. The extracted features are fed to the LSTM layer, which models the temporal sequence of human movement. In this way, the network learns spatiotemporal features directly through end-to-end training, producing a robust model. The test data cover 10 sports activities taken from related research at the University of Central Florida (UCF). The results show reasonably good performance, with a classification loss of 0.4 and an accuracy of 0.94, although some sports activities were still misclassified because their movements are similar. Further research should reduce the loss value, which is still high enough that tests repeatedly misclassify sports activities with similar movements.
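The data flow described in the abstract (per-frame convolution for spatial features, then an LSTM over the frame sequence, then a 10-way classification) can be illustrated with a minimal pure-Python forward pass. This is only a sketch under stated assumptions, not the authors' model: the frame size, kernel count, hidden size, global average pooling, random untrained weights, and every function name below are hypothetical. Dropout and batch size, which the study varies, are training-time settings and are not shown.

```python
import math
import random

def conv2d_valid(frame, kernel):
    """Valid 2D cross-correlation of one grayscale frame with one kernel, then ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(frame) - kh + 1, len(frame[0]) - kw + 1
    return [[max(0.0, sum(frame[i + a][j + b] * kernel[a][b]
                          for a in range(kh) for b in range(kw)))
             for j in range(out_w)] for i in range(out_h)]

def frame_features(frame, kernels):
    """Spatial features: one global-average-pooled activation per kernel (assumed pooling)."""
    feats = []
    for kernel in kernels:
        fmap = conv2d_valid(frame, kernel)
        feats.append(sum(map(sum, fmap)) / (len(fmap) * len(fmap[0])))
    return feats

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, W):
    """One LSTM step. W maps a gate name to (weight rows over [x; h], bias)."""
    z = x + h  # list concatenation of input features and previous hidden state
    def gate(name, act):
        rows, bias = W[name]
        return [act(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(rows, bias)]
    i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
    g = gate("g", math.tanh)
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new

def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def classify_clip(frames, kernels, W, W_out):
    """Run the CNN on every frame, feed the features to the LSTM in order,
    and classify from the final hidden state."""
    hidden = len(W["i"][1])
    h, c = [0.0] * hidden, [0.0] * hidden
    for frame in frames:
        h, c = lstm_step(frame_features(frame, kernels), h, c, W)
    logits = [sum(w * v for w, v in zip(row, h)) for row in W_out]
    return softmax(logits)

# Hypothetical sizes; only the 10 classes come from the abstract.
rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

N_KERNELS, HIDDEN, N_CLASSES = 4, 8, 10
kernels = [rand_matrix(3, 3) for _ in range(N_KERNELS)]
W = {name: (rand_matrix(HIDDEN, N_KERNELS + HIDDEN), [0.0] * HIDDEN)
     for name in "ifog"}
W_out = rand_matrix(N_CLASSES, HIDDEN)

clip = [rand_matrix(8, 8) for _ in range(5)]  # 5 random 8x8 "frames"
probs = classify_clip(clip, kernels, W, W_out)
print(len(probs), round(sum(probs), 6))
```

Because the weights are random and untrained, the resulting probabilities carry no meaning; the sketch only shows how spatial features from each frame become one time step of the LSTM input. In practice such a model would be built and trained with a deep learning framework rather than hand-written loops.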


