Sparse spatio-temporal representation of joint shape-motion cues for human action recognition in depth sequences

Quang D. Tran, N. Ly
Published in: The 2013 RIVF International Conference on Computing & Communication Technologies - Research, Innovation, and Vision for Future (RIVF)
Publication date: 2013-11-01
DOI: 10.1109/RIVF.2013.6719903 (https://doi.org/10.1109/RIVF.2013.6719903)
Citations: 19

Abstract

The availability of 3D sensors has recently made it possible to capture depth maps in real time, which simplifies a variety of visual recognition tasks, including object/action classification and 3D reconstruction. We address the problem of human action recognition in depth sequences. First, we present a new joint shape-motion descriptor, which we call the 3D Spherical Histogram of Oriented Normal Vectors (3DS-HONV), since it is a spatio-temporal extension of the original HONV quantized in a 3D spherical coordinate system. We further show that the optical flow fields in depth sequences can be used in conjunction with the presented descriptor to improve the capture of in-plane movements; the experiments show that this combination is more efficient than the standalone 3DS-HONV. In addition, discriminative dictionary learning and feature representation via sparse coding are applied to the proposed descriptors to reduce the intrinsic effects of noise and to capture high-level patterns. By learning these sparse and distinctive representations, we demonstrate large improvements over the state of the art on two challenging benchmarks, achieving an overall accuracy of 91.92% on the MSRAction3D dataset and 93.31% on the MSRGesture3D dataset.
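The abstract does not give the exact formulation of 3DS-HONV, so the following is only a rough sketch of the general idea of histogramming spatio-temporal depth gradients over spherical angles. It assumes that a depth sequence is stored as a (T, H, W) array, that the gradient (dz/dx, dz/dy, dz/dt) stands in for the oriented normal direction, and that a uniform grid of azimuth and zenith bins is used. The function name `spherical_normal_histogram` and the bin counts are hypothetical and are not taken from the paper.

```python
import numpy as np


def spherical_normal_histogram(depth_seq, n_azimuth=8, n_zenith=4):
    """Histogram of spatio-temporal depth-gradient orientations.

    depth_seq: array of shape (T, H, W), at least 2 samples per axis.
    Returns a flattened, L1-normalized histogram of length
    n_azimuth * n_zenith. Illustrative only, not the paper's descriptor.
    """
    depth_seq = np.asarray(depth_seq, dtype=np.float64)

    # Finite-difference gradients along time, rows (y) and columns (x).
    dt, dy, dx = np.gradient(depth_seq)

    # Azimuth: in-plane direction of the spatial gradient, in [-pi, pi].
    azimuth = np.arctan2(dy, dx)
    # Zenith: angle between the gradient vector and the time axis, in [0, pi].
    zenith = np.arctan2(np.hypot(dx, dy), dt)

    # Joint 2D histogram over the two angles (uniform bins, an assumption).
    hist, _, _ = np.histogram2d(
        azimuth.ravel(),
        zenith.ravel(),
        bins=[n_azimuth, n_zenith],
        range=[[-np.pi, np.pi], [0.0, np.pi]],
    )

    hist = hist.ravel()
    total = hist.sum()
    return hist / total if total > 0 else hist


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq = rng.random((4, 8, 8))  # synthetic stand-in for a depth clip
    descriptor = spherical_normal_histogram(seq)
    print(descriptor.shape)
```

In a full pipeline of the kind the abstract outlines, such histograms would presumably be computed per spatio-temporal cell, combined with optical-flow features, and then encoded against a learned dictionary; none of those steps are shown here.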