{"title":"Two deep approaches for ADL recognition: A multi-scale LSTM and a CNN-LSTM with a 3D matrix skeleton representation","authors":"Giovanni Ercolano, D. Riccio, Silvia Rossi","doi":"10.1109/ROMAN.2017.8172406","journal":{"name":"2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)","volume":"104 1","pages":"0"},"publicationDate":"2017-08-01","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"22","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/ROMAN.2017.8172406"}
Two deep approaches for ADL recognition: A multi-scale LSTM and a CNN-LSTM with a 3D matrix skeleton representation
In this work, we propose a deep learning approach for detecting activities of daily living (ADL) in a home environment, starting from the skeleton data of an RGB-D camera. In this context, the combination of ad hoc feature extraction/selection algorithms with supervised classification approaches has achieved excellent classification performance in the literature. Since recurrent neural networks (RNNs) can learn temporal dependencies from instances with a periodic pattern, we propose two deep learning architectures based on Long Short-Term Memory (LSTM) networks. The first (MT-LSTM) combines three LSTMs deployed to learn dependencies at different time scales from pre-processed skeleton data. The second (CNN-LSTM) uses a Convolutional Neural Network (CNN) to automatically extract features from the correlation of the limbs in a 3D-grid representation of the skeleton. Both models are tested on the CAD-60 dataset. Results show that the CNN-LSTM model outperforms the state of the art, with 95.4% precision and 94.4% recall.
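The two input representations mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify strides, grid size, or pre-processing, so the function names (`subsample`, `multi_scale_views`, `voxelize`), the strides `(1, 2, 4)`, and the 4x4x4 occupancy grid are all assumptions chosen for illustration. The first part builds three views of a skeleton sequence at different time scales, one per LSTM; the second places the joints of one frame into a 3D grid that a CNN could consume.

```python
from typing import Dict, List, Sequence, Tuple

Joint = Tuple[float, float, float]   # (x, y, z) position of one skeleton joint
Frame = Sequence[Joint]              # all joints captured at one time step


def subsample(frames: Sequence[Frame], stride: int) -> List[Frame]:
    """Keep every `stride`-th frame, which yields a coarser time scale."""
    return list(frames[::stride])


def multi_scale_views(
    frames: Sequence[Frame], strides: Sequence[int] = (1, 2, 4)
) -> Dict[int, List[Frame]]:
    """Build one view per stride (assumed strides; one view per LSTM)."""
    return {s: subsample(frames, s) for s in strides}


def voxelize(frame: Frame, size: int = 4) -> List[List[List[int]]]:
    """Mark the cell of each joint in a size^3 occupancy grid (assumed layout)."""
    grid = [[[0] * size for _ in range(size)] for _ in range(size)]
    # Per-axis bounds of the skeleton, used to normalise joints into [0, 1].
    mins = [min(j[a] for j in frame) for a in range(3)]
    maxs = [max(j[a] for j in frame) for a in range(3)]
    for joint in frame:
        idx = []
        for a in range(3):
            span = maxs[a] - mins[a]
            # A degenerate axis (all joints equal) maps to cell 0.
            pos = 0.0 if span == 0 else (joint[a] - mins[a]) / span
            # Clamp so that pos == 1.0 falls into the last cell.
            idx.append(min(int(pos * size), size - 1))
        x, y, z = idx
        grid[x][y][z] = 1
    return grid


# Toy sequence: 8 identical frames with two joints at opposite corners.
frames = [[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)] for _ in range(8)]
views = multi_scale_views(frames)
grid = voxelize(frames[0], size=4)
```

In a full model, each view would feed its own LSTM and the per-frame grids would pass through convolutional layers before an LSTM; those parts are omitted here because the abstract gives no layer details.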