Markerless Human Activity Recognition Method Based on Deep Neural Network Model Using Multiple Cameras
Authors: Prasetia Utama Putra, K. Shima, Koji Shimatani
Published in: 2018 5th International Conference on Control, Decision and Information Technologies (CoDIT), 2018-04-10
DOI: 10.1109/CoDIT.2018.8394780 (https://doi.org/10.1109/CoDIT.2018.8394780)
Most methods for multi-view human activity recognition can be classified as conventional computer vision approaches. Those approaches separate the feature descriptor from the discriminator, so the feature extractor cannot learn from the mistakes made by the classifier. This paper presents a deep neural network (DNN) model for human activity estimation that takes multi-view sequences of raw images as input. The approach incorporates the feature extractor and the discriminator into a single model, which comprises three parts: a convolutional neural network (CNN) block, MSLSTMRes, and a dense layer. The method can discriminate between human activities such as “walk” and “sit down” using only sequences of raw images. Experimental results on the IXMAS dataset using one-subject cross-validation demonstrate a high prediction rate, comparable to that of other methods in the literature that rely on preprocessed images, such as silhouettes and volumetric data, and on sophisticated feature extractors.
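The evaluation protocol named in the abstract can be illustrated with a short sketch. It assumes that "one-subject cross-validation" means holding out all recordings of one subject per fold (leave-one-subject-out); the abstract does not spell this out. The function name `leave_one_subject_out`, the `(subject_id, label)` sample layout, and the toy subject IDs are hypothetical and do not reflect the IXMAS file format or the authors' code.

```python
from collections import defaultdict


def leave_one_subject_out(samples):
    """Yield (held_out_subject, train_indices, test_indices) per subject.

    `samples` is a list of (subject_id, label) pairs. Both fields are
    hypothetical placeholders, not the actual IXMAS layout.
    """
    # Group sample indices by the subject who performed the activity.
    by_subject = defaultdict(list)
    for index, (subject_id, _label) in enumerate(samples):
        by_subject[subject_id].append(index)

    # Each subject is held out exactly once; all others form the training set.
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = sorted(
            i
            for subject, indices in by_subject.items()
            if subject != held_out
            for i in indices
        )
        yield held_out, train_idx, test_idx


# Toy data: three made-up subjects, each performing two activities.
samples = [
    ("s1", "walk"), ("s1", "sit down"),
    ("s2", "walk"), ("s2", "sit down"),
    ("s3", "walk"), ("s3", "sit down"),
]

folds = list(leave_one_subject_out(samples))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Splitting by subject rather than by individual sequence keeps every recording of the held-out person out of training, so the reported rate reflects generalization to unseen people.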