{"title":"基于稀疏自编码器和方向梯度直方图的人体动作识别","authors":"Pooi Shiang Tan, K. Lim, C. Lee","doi":"10.1109/IICAIET49801.2020.9257863","DOIUrl":null,"url":null,"abstract":"This paper presents a video-based human action recognition method leveraging deep learning model. Prior to the filtering phase, the input images are pre-processed by converting them into grayscale images. Thereafter, the region of interest that contains human performing action are cropped out by a pre-trained pedestrian detector. Next, the region of interest will be resized and passed as the input image to the filtering phase. In this phase, the filter kernels are trained using Sparse Autoencoder on the natural images. After obtaining the filter kernels, convolution operation is performed in the input image and the filter kernels. The filtered images are then passed to the feature extraction phase. The Histogram of Oriented Gradients descriptor is used to encode the local and global texture information of the filtered images. Lastly, in the classification phase, a Modified Hausdorff Distance is applied to classify the test sample to its nearest match based on the histograms. The performance of the deep learning algorithm is evaluated on three benchmark datasets, namely Weizmann Action Dataset, CAD-60 Dataset and Multimedia University (MMU) Human Action Dataset. 
The experimental results show that the proposed deep learning algorithm outperforms other methods on the Weizmann Dataset, CAD-60 Dataset and MMU Human Action Dataset with recognition rates of 100%, 88.24% and 99.5% respectively.","PeriodicalId":300885,"journal":{"name":"2020 IEEE 2nd International Conference on Artificial Intelligence in Engineering and Technology (IICAIET)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Human Action Recognition with Sparse Autoencoder and Histogram of Oriented Gradients\",\"authors\":\"Pooi Shiang Tan, K. Lim, C. Lee\",\"doi\":\"10.1109/IICAIET49801.2020.9257863\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper presents a video-based human action recognition method leveraging deep learning model. Prior to the filtering phase, the input images are pre-processed by converting them into grayscale images. Thereafter, the region of interest that contains human performing action are cropped out by a pre-trained pedestrian detector. Next, the region of interest will be resized and passed as the input image to the filtering phase. In this phase, the filter kernels are trained using Sparse Autoencoder on the natural images. After obtaining the filter kernels, convolution operation is performed in the input image and the filter kernels. The filtered images are then passed to the feature extraction phase. The Histogram of Oriented Gradients descriptor is used to encode the local and global texture information of the filtered images. Lastly, in the classification phase, a Modified Hausdorff Distance is applied to classify the test sample to its nearest match based on the histograms. 
The performance of the deep learning algorithm is evaluated on three benchmark datasets, namely Weizmann Action Dataset, CAD-60 Dataset and Multimedia University (MMU) Human Action Dataset. The experimental results show that the proposed deep learning algorithm outperforms other methods on the Weizmann Dataset, CAD-60 Dataset and MMU Human Action Dataset with recognition rates of 100%, 88.24% and 99.5% respectively.\",\"PeriodicalId\":300885,\"journal\":{\"name\":\"2020 IEEE 2nd International Conference on Artificial Intelligence in Engineering and Technology (IICAIET)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-09-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE 2nd International Conference on Artificial Intelligence in Engineering and Technology (IICAIET)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IICAIET49801.2020.9257863\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 2nd International Conference on Artificial Intelligence in Engineering and Technology (IICAIET)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IICAIET49801.2020.9257863","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Human Action Recognition with Sparse Autoencoder and Histogram of Oriented Gradients
This paper presents a video-based human action recognition method leveraging a deep learning model. Before the filtering phase, the input frames are pre-processed by converting them to grayscale. A pre-trained pedestrian detector then crops out the region of interest containing the human performing the action. Next, the region of interest is resized and passed as the input image to the filtering phase, in which the filter kernels are trained with a Sparse Autoencoder on natural images. Once the filter kernels are obtained, the input image is convolved with each of them, and the filtered images are passed to the feature extraction phase. The Histogram of Oriented Gradients descriptor encodes the local and global texture information of the filtered images. Lastly, in the classification phase, a Modified Hausdorff Distance assigns each test sample to its nearest match based on the histograms. The performance of the algorithm is evaluated on three benchmark datasets, namely the Weizmann Action Dataset, the CAD-60 Dataset, and the Multimedia University (MMU) Human Action Dataset. The experimental results show that the proposed algorithm outperforms other methods on the Weizmann Dataset, CAD-60 Dataset, and MMU Human Action Dataset, with recognition rates of 100%, 88.24%, and 99.5% respectively.
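The filtering phase described above amounts to convolving the cropped grayscale region of interest with a bank of learned kernels. The sketch below is only illustrative: the 3x3 edge filters stand in as placeholders for the kernels that the paper actually learns with a Sparse Autoencoder on natural images, and `convolve2d` is a hypothetical helper, not the authors' implementation.

```python
# Hypothetical sketch of the filtering phase: each kernel is convolved
# with the grayscale region of interest in "valid" mode. The kernels here
# are hand-written placeholders for the sparse-autoencoder-learned ones.

def convolve2d(image, kernel):
    """Valid-mode 2D convolution of a grayscale image (list of lists)."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            acc = 0.0
            for u in range(kh):
                for v in range(kw):
                    # Flip the kernel in both axes for true convolution
                    # (as opposed to cross-correlation).
                    acc += image[i + u][j + v] * kernel[kh - 1 - u][kw - 1 - v]
            row.append(acc)
        out.append(row)
    return out

# Placeholder "learned" kernels: a horizontal and a vertical edge filter.
kernels = [
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],
]

# Toy 6x6 grayscale region of interest.
image = [[float((i + j) % 4) for j in range(6)] for i in range(6)]

# One filtered image per kernel, each 4x4 after valid-mode convolution.
filtered = [convolve2d(image, k) for k in kernels]
```

Each filtered image would then be handed to the HOG descriptor in the feature extraction phase.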
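The classification phase can be sketched with the standard Modified Hausdorff Distance of Dubuisson and Jain between two point sets. How the paper maps HOG histograms onto point sets is not specified in the abstract, so the `classify` helper and the gallery layout below are assumptions for illustration only.

```python
import math

def _directed(A, B):
    """Mean over points in A of the distance to the nearest point in B."""
    return sum(min(math.dist(a, b) for b in B) for a in A) / len(A)

def modified_hausdorff(A, B):
    """Modified Hausdorff Distance between two point sets A and B:
    the larger of the two directed mean-minimum distances."""
    return max(_directed(A, B), _directed(B, A))

def classify(test_desc, gallery):
    """Assign test_desc the label of its nearest gallery descriptor
    under the Modified Hausdorff Distance (hypothetical helper)."""
    return min(gallery, key=lambda lbl: modified_hausdorff(test_desc, gallery[lbl]))

# Toy gallery: one descriptor (point set) per action label.
gallery = {
    "walk": [(0.0, 0.0), (1.0, 1.0)],
    "run": [(5.0, 5.0), (6.0, 6.0)],
}
label = classify([(0.1, 0.1), (1.0, 1.0)], gallery)
```

Unlike a point-to-point metric, the averaged minimum distances make the match tolerant to small local shifts between descriptors, which is presumably why a Hausdorff-style distance was chosen over, say, Euclidean distance between histograms.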