Human Action Recognition Using Deep Learning Methods

2020 35th International Conference on Image and Vision Computing New Zealand (IVCNZ), Pub Date: 2020-11-25, DOI: 10.1109/IVCNZ51579.2020.9290594

Zeqi Yu, W. Yan


Citations: 9

Abstract

Human action recognition aims to identify and understand the actions of people in videos and to output the corresponding labels. Beyond the spatial correlations present in 2D images, actions in a video also have attributes in the temporal domain. Human actions are complex, and factors such as changes of perspective and background noise affect recognition. To address these problems, this paper designs and implements three algorithms, all based on convolutional neural networks (CNNs): Two-Stream CNN, CNN+LSTM, and 3D CNN are used to identify human actions in videos. Each algorithm is explained and analyzed in detail. The HMDB-51 dataset is used to test the algorithms and obtain their best results. Experimental results show that all three methods effectively identify human actions in a given video, and the best-performing algorithm is selected on that basis.
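The abstract names a Two-Stream CNN but does not say how the two streams are combined. As a rough illustration only, the sketch below assumes the common late-fusion scheme in which a spatial stream (appearance) and a temporal stream (motion) each produce class scores that are turned into probabilities and averaged. The function names, the `temporal_weight` parameter, and the three-class toy scores are hypothetical and are not taken from the paper; HMDB-51 itself has 51 action classes.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_two_stream(spatial_logits, temporal_logits, temporal_weight=0.5):
    """Late fusion by weighted averaging of per-stream probabilities.

    Assumption: the streams are fused by averaging class probabilities.
    The paper's abstract does not specify its fusion rule.
    Returns (predicted_class_index, fused_probabilities).
    """
    if len(spatial_logits) != len(temporal_logits):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= temporal_weight <= 1.0:
        raise ValueError("temporal_weight must lie in [0, 1]")

    p_spatial = softmax(spatial_logits)
    p_temporal = softmax(temporal_logits)
    fused = [
        (1.0 - temporal_weight) * s + temporal_weight * t
        for s, t in zip(p_spatial, p_temporal)
    ]
    # The predicted label is the class with the highest fused probability.
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


if __name__ == "__main__":
    # Toy scores for three hypothetical classes.
    spatial = [2.0, 1.0, 0.1]   # appearance favours class 0
    temporal = [0.5, 2.5, 0.3]  # motion favours class 1
    label, fused = fuse_two_stream(spatial, temporal)
    print(label, [round(p, 3) for p in fused])
```

In this toy case the motion stream is more confident than the appearance stream, so equal-weight averaging selects class 1; setting `temporal_weight=0.0` falls back to the spatial stream alone and selects class 0.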

