Human Action Recognition with Sparse Autoencoder and Histogram of Oriented Gradients

Pooi Shiang Tan, K. Lim, C. Lee
DOI: 10.1109/IICAIET49801.2020.9257863 (https://doi.org/10.1109/IICAIET49801.2020.9257863)
Published in: 2020 IEEE 2nd International Conference on Artificial Intelligence in Engineering and Technology (IICAIET)
Publication date: 2020-09-26
Citations: 1

Abstract

This paper presents a video-based human action recognition method that leverages a deep learning model. Before the filtering phase, the input images are pre-processed by converting them to grayscale. The region of interest containing the person performing the action is then cropped out by a pre-trained pedestrian detector, resized, and passed as the input image to the filtering phase. In this phase, the filter kernels are trained on natural images using a Sparse Autoencoder. Once the filter kernels are obtained, the input image is convolved with them. The filtered images are then passed to the feature extraction phase, where the Histogram of Oriented Gradients descriptor encodes their local and global texture information. Lastly, in the classification phase, a Modified Hausdorff Distance is applied to classify the test sample to its nearest match based on the histograms. The proposed deep learning algorithm is evaluated on three benchmark datasets: the Weizmann Action Dataset, the CAD-60 Dataset, and the Multimedia University (MMU) Human Action Dataset. The experimental results show that it outperforms other methods on the Weizmann, CAD-60, and MMU Human Action datasets, with recognition rates of 100%, 88.24%, and 99.5%, respectively.
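The classification phase described above can be illustrated with a short sketch. The abstract does not say which formulation of the Modified Hausdorff Distance is used or how the histograms are grouped, so this example assumes the common form (the larger of the two mean nearest-neighbour distances between two point sets) and assumes each sample is represented as a set of histogram vectors. The function names, the toy gallery, and the labels are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
FeatureSet = Sequence[Vector]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def directed_distance(A: FeatureSet, B: FeatureSet) -> float:
    """Mean, over the vectors in A, of the distance to the closest vector in B."""
    return sum(min(euclidean(a, b) for b in B) for a in A) / len(A)


def modified_hausdorff(A: FeatureSet, B: FeatureSet) -> float:
    """Assumed MHD form: the larger of the two directed mean distances."""
    return max(directed_distance(A, B), directed_distance(B, A))


def classify(test: FeatureSet, gallery: List[Tuple[str, FeatureSet]]) -> str:
    """Return the label of the gallery sample with the smallest MHD to `test`."""
    label, _ = min(
        ((lbl, modified_hausdorff(test, feats)) for lbl, feats in gallery),
        key=lambda pair: pair[1],
    )
    return label


# Toy 2-bin "histograms"; real HOG descriptors would be much longer.
gallery = [
    ("walk", [[1.0, 0.0], [0.8, 0.2]]),
    ("wave", [[0.0, 1.0], [0.2, 0.8]]),
]
test_sample = [[0.9, 0.1], [0.7, 0.3]]

print(classify(test_sample, gallery))  # walk
```

Averaging the nearest-neighbour distances, instead of taking their maximum as the classical Hausdorff distance does, makes the measure less sensitive to a single outlying histogram, which is one plausible reason to prefer it for nearest-match classification.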