Video Human Action Recognition Algorithm Based on Double Branch 3D-CNN
Yu Wang, Jiaxi Sun
2022 15th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), 2022-11-05
DOI: 10.1109/CISP-BMEI56279.2022.9979858
Citations: 0
Abstract
Traditional action recognition algorithms that rely on hand-crafted feature extraction are relatively complex and achieve low recognition accuracy. This paper presents a video action recognition algorithm based on a double-branch convolutional neural network, which consists of two separate convolutional neural networks applied in sequence. The training network extracts spatio-temporal features through 3D convolution and GRU layers. The features extracted by the training network are then fed into the test network for classification. The proposed algorithm achieves an accuracy of 95.0% on the UCF-101 dataset. Comparison with other benchmark methods verifies the accuracy and effectiveness of the method.
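The abstract does not give layer sizes or training details, so the following is only a minimal NumPy sketch of the data flow it describes: a 3D convolution over a clip, a GRU over the resulting per-frame features, and a separate classifier that consumes the extracted feature vector. Everything concrete here is an assumption made for illustration: the function names, the single-channel 16x32x32 clip, the 8 kernels of size 3x3x3, the 12-unit GRU without bias terms, the spatial mean pooling, and the random untrained weights. Only the 101 output classes follow from UCF-101 itself.

```python
import numpy as np


def conv3d_valid(clip, kernels):
    """Valid 3D cross-correlation of a single-channel clip.

    clip: (T, H, W); kernels: (C, kt, kh, kw) -> output (C, T', H', W').
    """
    windows = np.lib.stride_tricks.sliding_window_view(clip, kernels.shape[1:])
    # windows has shape (T', H', W', kt, kh, kw)
    return np.einsum("thwijk,cijk->cthw", windows, kernels)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_last_state(seq, params):
    """Run a bias-free GRU over seq (T', D) and return the final hidden state."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    h = np.zeros(Uz.shape[0])
    for x in seq:
        z = sigmoid(Wz @ x + Uz @ h)              # update gate
        r = sigmoid(Wr @ x + Ur @ h)              # reset gate
        h_tilde = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
        h = (1.0 - z) * h + z * h_tilde
    return h


def extract_features(clip, kernels, gru_params):
    """First branch (assumed layout): 3D conv -> ReLU -> spatial mean -> GRU."""
    fmap = np.maximum(conv3d_valid(clip, kernels), 0.0)  # (C, T', H', W')
    seq = fmap.mean(axis=(2, 3)).T                       # (T', C)
    return gru_last_state(seq, gru_params)


def classify(features, W, b):
    """Second branch (assumed layout): linear layer followed by softmax."""
    logits = W @ features + b
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


# Hypothetical sizes and random weights, chosen only to show the shapes.
rng = np.random.default_rng(0)
clip = rng.random((16, 32, 32))              # 16 grayscale frames, 32x32 each
kernels = rng.normal(0, 0.1, (8, 3, 3, 3))   # 8 spatio-temporal filters
hidden = 12
gru_params = [rng.normal(0, 0.1, shape)
              for shape in [(hidden, 8), (hidden, hidden)] * 3]
W_cls = rng.normal(0, 0.1, (101, hidden))    # UCF-101 has 101 action classes
b_cls = np.zeros(101)

features = extract_features(clip, kernels, gru_params)
probs = classify(features, W_cls, b_cls)
print(features.shape, probs.shape)
```

Because the weights are random, the predicted class is meaningless. The sketch is meant only to show how a clip is reduced to one spatio-temporal feature vector that a separate network then classifies.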