Two deep approaches for ADL recognition: A multi-scale LSTM and a CNN-LSTM with a 3D matrix skeleton representation
Giovanni Ercolano, D. Riccio, Silvia Rossi
2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 2017-08-01
DOI: 10.1109/ROMAN.2017.8172406 (https://doi.org/10.1109/ROMAN.2017.8172406)
Citations: 22
Abstract
In this work, we propose a deep learning approach for detecting activities of daily living (ADL) in a home environment from the skeleton data of an RGB-D camera. In this context, combining ad hoc feature extraction/selection algorithms with supervised classification has achieved excellent classification performance in the literature. Since recurrent neural networks (RNNs) can learn temporal dependencies from instances with a periodic pattern, we propose two deep learning architectures based on Long Short-Term Memory (LSTM) networks. The first (MT-LSTM) combines three LSTMs that learn dependencies at different time scales from pre-processed skeleton data. The second (CNN-LSTM) uses a Convolutional Neural Network (CNN) to automatically extract features from the correlation of the limbs in a 3D-grid representation of the skeleton. Both models are tested on the CAD-60 dataset. Results show that the CNN-LSTM model outperforms the state of the art, with 95.4% precision and 94.4% recall.
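The abstract reports precision and recall for a multi-class activity classifier. As a reading aid, the sketch below shows one common way such figures can be computed: per-class precision and recall, averaged with equal weight per class (macro averaging). The abstract does not state which averaging scheme the authors used, so macro averaging is an assumption here, and the function name and the activity labels are hypothetical.

```python
from collections import Counter


def macro_precision_recall(y_true, y_pred):
    """Return (macro precision, macro recall) for two equal-length label lists.

    Assumption: every class counts equally (macro averaging). The source
    abstract does not say how its precision and recall were averaged.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative

    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls = [], []
    for c in classes:
        predicted = tp[c] + fp[c]  # samples the model assigned to class c
        actual = tp[c] + fn[c]     # samples that really belong to class c
        # A class that is never predicted (or never present) scores 0.0.
        precisions.append(tp[c] / predicted if predicted else 0.0)
        recalls.append(tp[c] / actual if actual else 0.0)

    return sum(precisions) / len(classes), sum(recalls) / len(classes)


# Hypothetical labels for six clips; not taken from CAD-60.
y_true = ["drink", "drink", "cook", "cook", "write", "write"]
y_pred = ["drink", "cook", "cook", "cook", "write", "drink"]
precision, recall = macro_precision_recall(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Precision penalizes a class for clips wrongly assigned to it, while recall penalizes it for clips it missed, which is why the two reported figures can differ.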