Human Action Recognition Based on Vision Transformer and L2 Regularization

Qiliang Chen, Hasiqidalatu Tang, Jia-xin Cai
DOI: 10.1145/3581807.3581840
Published in: Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition, 2022-11-17

Abstract

In recent years, human action recognition has been a focus of computer vision, with promising applications in many fields such as security monitoring, behavioral characteristics analysis, and network video image restoration. This paper studies a human action recognition method based on the attention mechanism. To improve model accuracy and efficiency, a Vision Transformer (ViT) network is adopted as the feature extraction framework; because video data has both temporal and spatial characteristics, a space-time attention mechanism is chosen in place of a traditional convolutional network for feature extraction. In addition, L2 weight decay regularization is introduced during model training to prevent overfitting of the training data. Experiments on the human action dataset UCF101 show that the proposed model effectively improves recognition accuracy compared with other models.
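The L2 weight decay regularization mentioned in the abstract can be illustrated with a minimal sketch: adding the penalty (λ/2)·‖w‖² to the loss contributes λ·w to each gradient, so every SGD step shrinks the weights toward zero. The function and hyperparameter values below are illustrative, not taken from the paper.

```python
def sgd_step(w, grad, lr=0.1, weight_decay=1e-2):
    """One SGD update with L2 regularization ("weight decay").

    The penalty (weight_decay / 2) * ||w||^2 adds weight_decay * w_i
    to each gradient component, pulling the weights toward zero.
    """
    return [wi - lr * (gi + weight_decay * wi) for wi, gi in zip(w, grad)]

# With a zero task gradient, only the decay term acts:
# each weight is multiplied by (1 - lr * weight_decay) = 0.999.
w = sgd_step([1.0, -2.0], [0.0, 0.0])
print(w)  # [0.999, -1.998]
```

In most deep learning frameworks this same effect is obtained by passing a weight-decay hyperparameter to the optimizer rather than adding the penalty to the loss by hand.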