Energy-Based Global Ternary Image for Action Recognition Using Sole Depth Sequences

Mengyuan Liu, Hong Liu, Chen Chen, M. Najafian
{"title":"Energy-Based Global Ternary Image for Action Recognition Using Sole Depth Sequences","authors":"Mengyuan Liu, Hong Liu, Chen Chen, M. Najafian","doi":"10.1109/3DV.2016.14","DOIUrl":null,"url":null,"abstract":"In order to efficiently recognize actions from depth sequences, we propose a novel feature, called Global Ternary Image (GTI), which implicitly encodes both motion regions and motion directions between consecutive depth frames via recording the changes of depth pixels. In this study, each pixel in GTI indicates one of the three possible states, namely positive, negative and neutral, which represents increased, decreased and same depth values, respectively. Since GTI is sensitive to the subject's speed, we obtain energy-based GTI (E-GTI) by extracting GTI from pairwise depth frames with equal motion energy. To involve temporal information among depth frames, we extract E-GTI using multiple settings of motion energy. Here, the noise can be effectively suppressed by describing E-GTIs using the Radon Transform (RT). The 3D action representation is formed as a result of feeding the hierarchical combination of RTs to the Bag of Visual Words model (BoVW). From the extensive experiments on four benchmark datasets, namely MSRAction3D, DHA, MSRGesture3D and SKIG, it is evident that the hierarchical E-GTI outperforms the existing methods in 3D action recognition. We tested our proposed approach on extended MSRAction3D dataset to further investigate and verify its robustness against partial occlusions, noise and speed.","PeriodicalId":425304,"journal":{"name":"2016 Fourth International Conference on 3D Vision (3DV)","volume":"181 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 Fourth International Conference on 3D Vision (3DV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/3DV.2016.14","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

Abstract

In order to efficiently recognize actions from depth sequences, we propose a novel feature, called Global Ternary Image (GTI), which implicitly encodes both motion regions and motion directions between consecutive depth frames via recording the changes of depth pixels. In this study, each pixel in GTI indicates one of the three possible states, namely positive, negative and neutral, which represents increased, decreased and same depth values, respectively. Since GTI is sensitive to the subject's speed, we obtain energy-based GTI (E-GTI) by extracting GTI from pairwise depth frames with equal motion energy. To involve temporal information among depth frames, we extract E-GTI using multiple settings of motion energy. Here, the noise can be effectively suppressed by describing E-GTIs using the Radon Transform (RT). The 3D action representation is formed as a result of feeding the hierarchical combination of RTs to the Bag of Visual Words model (BoVW). From the extensive experiments on four benchmark datasets, namely MSRAction3D, DHA, MSRGesture3D and SKIG, it is evident that the hierarchical E-GTI outperforms the existing methods in 3D action recognition. We tested our proposed approach on extended MSRAction3D dataset to further investigate and verify its robustness against partial occlusions, noise and speed.
基于能量的全局三元图像单一深度序列动作识别
为了有效地识别深度序列中的动作,我们提出了一种新的特征,称为全局三元图像(Global Ternary Image, GTI),它通过记录深度像素的变化来隐式地编码连续深度帧之间的运动区域和运动方向。在本研究中,GTI中的每个像素都代表三种可能状态中的一种,即正、负和中性,分别代表增加、减少和相同深度值。由于GTI对被测对象的速度非常敏感,我们从运动能量相等的两两深度帧中提取GTI,得到基于能量的GTI (E-GTI)。为了在深度帧之间包含时间信息,我们使用多种运动能量设置提取E-GTI。在这里,使用Radon变换(RT)描述e - gti可以有效地抑制噪声。3D动作表示是将RTs的分层组合提供给视觉词袋模型(BoVW)的结果。在MSRAction3D、DHA、MSRGesture3D和SKIG四个基准数据集上进行的大量实验表明,分层E-GTI在三维动作识别方面优于现有方法。我们在扩展的MSRAction3D数据集上测试了我们提出的方法,以进一步研究和验证其对部分遮挡、噪声和速度的鲁棒性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信