{"title":"Sequential learning for multimodal 3D human activity recognition with Long-Short Term Memory","authors":"Kang Li, Xiaoguang Zhao, Jiang Bian, M. Tan","doi":"10.1109/ICMA.2017.8016048","DOIUrl":null,"url":null,"abstract":"Capability of recognizing human activities is essential to human robot interaction for an intelligent robot. Traditional methods generally rely on hand-crafted features, which is not strong and accurate enough. In this paper, we present a feature self-learning mechanism for human activity recognition by using three-layer Long Short Term Memory (LSTM) to model long-term contextual information of temporal skeleton sequences for human activities which are represented by the trajectories of skeleton joints. Moreover, we add dropout mechanism and L2 regularization to the output of the three-layer Long Short Term Memory (LSTM) to avoid overfitting, and obtain better representation for feature modeling. Experimental results on a publicly available UTD multimodal human activity dataset demonstrate the effectiveness of the proposed recognition method.","PeriodicalId":124642,"journal":{"name":"2017 IEEE International Conference on Mechatronics and Automation (ICMA)","volume":"78 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"20","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Conference on Mechatronics and Automation (ICMA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMA.2017.8016048","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 20
Abstract
The capability to recognize human activities is essential for human-robot interaction with an intelligent robot. Traditional methods generally rely on hand-crafted features, which are often neither robust nor accurate enough. In this paper, we present a feature self-learning mechanism for human activity recognition that uses a three-layer Long Short-Term Memory (LSTM) network to model the long-term contextual information of temporal skeleton sequences, where activities are represented by the trajectories of skeleton joints. Moreover, we apply dropout and L2 regularization to the output of the three-layer LSTM to avoid overfitting and to obtain a better representation for feature modeling. Experimental results on the publicly available UTD multimodal human activity dataset demonstrate the effectiveness of the proposed recognition method.
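For concreteness, a minimal PyTorch sketch of the kind of architecture the abstract describes (a three-layer LSTM over skeleton-joint trajectories, dropout on the LSTM output, and L2 regularization via weight decay) might look like the following. The input dimension (20 joints × 3 coordinates per frame), hidden size, number of classes, dropout rate, and weight-decay value are illustrative assumptions, not the paper's reported settings.

```python
import torch
import torch.nn as nn

class SkeletonLSTM(nn.Module):
    """Hypothetical sketch of a three-layer LSTM activity classifier;
    all layer sizes and hyperparameters below are assumptions."""
    def __init__(self, joint_dim=60, hidden_dim=128, num_classes=27):
        super().__init__()
        # Three stacked LSTM layers model long-term context of the
        # skeleton-joint trajectories (one video frame = one time step).
        self.lstm = nn.LSTM(joint_dim, hidden_dim, num_layers=3,
                            batch_first=True)
        # Dropout on the LSTM output to reduce overfitting,
        # as the abstract describes.
        self.dropout = nn.Dropout(p=0.5)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):
        # x: (batch, frames, joint_dim) skeleton sequence
        out, _ = self.lstm(x)
        # The last time step summarizes the whole sequence.
        last = self.dropout(out[:, -1, :])
        return self.fc(last)  # per-class activity scores

model = SkeletonLSTM()
# L2 regularization is applied as weight decay in the optimizer.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
```

In this sketch the classifier reads only the final hidden state; averaging the hidden states over all frames is an equally plausible design choice that the abstract alone does not settle.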