ADL识别的两种深度方法:多尺度LSTM和具有三维矩阵骨架表示的CNN-LSTM

2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN) Pub Date : 2017-08-01 DOI:10.1109/ROMAN.2017.8172406

Giovanni Ercolano, D. Riccio, Silvia Rossi

{"title":"ADL识别的两种深度方法:多尺度LSTM和具有三维矩阵骨架表示的CNN-LSTM","authors":"Giovanni Ercolano, D. Riccio, Silvia Rossi","doi":"10.1109/ROMAN.2017.8172406","DOIUrl":null,"url":null,"abstract":"In this work, we propose a deep learning approach for the detection of the activities of daily living (ADL) in a home environment starting from the skeleton data of an RGB-D camera. In this context, the combination of ad hoc features extraction/selection algorithms with supervised classification approaches has reached an excellent classification performance in the literature. Since the recurrent neural networks (RNNs) can learn temporal dependencies from instances with a periodic pattern, we propose two deep learning architectures based on Long Short-Term Memory (LSTM) networks. The first (MT-LSTM) combines three LSTMs deployed to learn different time-scale dependencies from pre-processed skeleton data. The second (CNN-LSTM) exploits the use of a Convolutional Neural Network (CNN) to automatically extract features by the correlation of the limbs in a skeleton 3D-grid representation. These models are tested on the CAD-60 dataset. Results show that the CNN-LSTM model outperforms the state-of-the-art performance with 95.4% of precision and 94.4% of recall.","PeriodicalId":134777,"journal":{"name":"2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":"{\"title\":\"Two deep approaches for ADL recognition: A multi-scale LSTM and a CNN-LSTM with a 3D matrix skeleton representation\",\"authors\":\"Giovanni Ercolano, D. Riccio, Silvia Rossi\",\"doi\":\"10.1109/ROMAN.2017.8172406\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this work, we propose a deep learning approach for the detection of the activities of daily living (ADL) in a home environment starting from the skeleton data of an RGB-D camera. In this context, the combination of ad hoc features extraction/selection algorithms with supervised classification approaches has reached an excellent classification performance in the literature. Since the recurrent neural networks (RNNs) can learn temporal dependencies from instances with a periodic pattern, we propose two deep learning architectures based on Long Short-Term Memory (LSTM) networks. The first (MT-LSTM) combines three LSTMs deployed to learn different time-scale dependencies from pre-processed skeleton data. The second (CNN-LSTM) exploits the use of a Convolutional Neural Network (CNN) to automatically extract features by the correlation of the limbs in a skeleton 3D-grid representation. These models are tested on the CAD-60 dataset. Results show that the CNN-LSTM model outperforms the state-of-the-art performance with 95.4% of precision and 94.4% of recall.\",\"PeriodicalId\":134777,\"journal\":{\"name\":\"2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)\",\"volume\":\"104 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"22\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ROMAN.2017.8172406\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ROMAN.2017.8172406","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 22

摘要

在这项工作中，我们提出了一种深度学习方法，用于从RGB-D相机的骨架数据开始检测家庭环境中的日常生活活动(ADL)。在此背景下，将ad hoc特征提取/选择算法与监督分类方法相结合，在文献中取得了优异的分类性能。由于递归神经网络(rnn)可以从具有周期性模式的实例中学习时间依赖性，我们提出了两种基于长短期记忆(LSTM)网络的深度学习架构。第一个(MT-LSTM)结合了部署的三个lstm，以从预处理的骨架数据中学习不同的时间尺度依赖关系。第二种(CNN- lstm)利用卷积神经网络(CNN)在骨骼3d网格表示中通过肢体的相关性自动提取特征。这些模型在CAD-60数据集上进行了测试。结果表明，CNN-LSTM模型以95.4%的准确率和94.4%的召回率优于最先进的性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Two deep approaches for ADL recognition: A multi-scale LSTM and a CNN-LSTM with a 3D matrix skeleton representation

In this work, we propose a deep learning approach for the detection of the activities of daily living (ADL) in a home environment starting from the skeleton data of an RGB-D camera. In this context, the combination of ad hoc features extraction/selection algorithms with supervised classification approaches has reached an excellent classification performance in the literature. Since the recurrent neural networks (RNNs) can learn temporal dependencies from instances with a periodic pattern, we propose two deep learning architectures based on Long Short-Term Memory (LSTM) networks. The first (MT-LSTM) combines three LSTMs deployed to learn different time-scale dependencies from pre-processed skeleton data. The second (CNN-LSTM) exploits the use of a Convolutional Neural Network (CNN) to automatically extract features by the correlation of the limbs in a skeleton 3D-grid representation. These models are tested on the CAD-60 dataset. Results show that the CNN-LSTM model outperforms the state-of-the-art performance with 95.4% of precision and 94.4% of recall.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)

自引率

0.00%

发文量