Learning informative pairwise joints with energy-based temporal pyramid for 3D action recognition

Mengyuan Liu, Chen Chen, Hong Liu
Published in: 2017 IEEE International Conference on Multimedia and Expo (ICME), 2017-07-01
DOI: 10.1109/ICME.2017.8019313 (https://doi.org/10.1109/ICME.2017.8019313)
Citations: 7

Abstract

This paper presents an effective local spatial-temporal descriptor for action recognition from skeleton sequences. The distinguishing property of our descriptor is that it accounts for both spatial-temporal discrimination and variations in action speed, so that distinguishing similar actions and recognizing actions performed at different speeds are addressed together. The algorithm consists of two stages. First, a frame selection method removes noisy skeletons from a given skeleton sequence. From the selected skeletons, skeleton joints are mapped to a high-dimensional space, where each point describes the kinematics, time label, and joint label of a skeleton joint. To encode relative relationships among joints, pairs of points from this space are then jointly mapped to a new space, where each point encodes the relative relationship between two skeleton joints. Second, a Fisher Vector (FV) is employed to encode all points from the new space as a compact feature representation. To cope with speed variations in actions, an energy-based temporal pyramid is applied to form a multi-temporal FV representation, which is fed into a kernel-based extreme learning machine classifier for recognition. Extensive experiments on benchmark datasets consistently show that our method outperforms state-of-the-art approaches for skeleton-based action recognition.
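The abstract does not give formulas for the pairwise mapping or for the energy-based pyramid, so the following is only a minimal sketch of the two ideas under stated assumptions, not the authors' implementation. It assumes that a pairwise point is the coordinate difference between two joints plus the frame index and both joint indices, that "energy" is the summed squared joint displacement between consecutive frames, and that each pyramid level cuts the sequence into segments of roughly equal accumulated energy. The function names (`pairwise_points`, `motion_energy`, `energy_split`, `temporal_pyramid`) and the level sizes are hypothetical.

```python
from itertools import combinations


def pairwise_points(frames):
    """Return one point per joint pair per frame: (dx, dy, dz, t, i, j).

    Assumed encoding: coordinate difference, frame index, both joint indices.
    """
    points = []
    for t, joints in enumerate(frames):
        for i, j in combinations(range(len(joints)), 2):
            diff = tuple(b - a for a, b in zip(joints[i], joints[j]))
            points.append(diff + (t, i, j))
    return points


def motion_energy(frames):
    """Assumed energy: squared joint displacement from the previous frame."""
    energies = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(frames, frames[1:]):
        e = 0.0
        for p, c in zip(prev, cur):
            e += sum((ci - pi) ** 2 for pi, ci in zip(p, c))
        energies.append(e)
    return energies


def energy_split(frames, parts):
    """Split frame indices into `parts` contiguous segments of roughly
    equal accumulated energy (segments may be empty for abrupt motion)."""
    n = len(frames)
    energies = motion_energy(frames)
    total = sum(energies)
    if total == 0.0:
        # No motion at all: fall back to a uniform split.
        bounds = [(k * n) // parts for k in range(parts + 1)]
    else:
        bounds, acc, k = [0], 0.0, 1
        for idx, e in enumerate(energies):
            acc += e
            # Close a segment each time another 1/parts of the energy is reached.
            while k < parts and acc >= k * total / parts:
                bounds.append(idx + 1)
                k += 1
        while len(bounds) < parts:
            bounds.append(n)
        bounds.append(n)
    return [list(range(bounds[s], bounds[s + 1])) for s in range(parts)]


def temporal_pyramid(frames, levels=(1, 2, 4)):
    """Collect the segments of every pyramid level into one flat list."""
    return [seg for parts in levels for seg in energy_split(frames, parts)]


# One joint that rests for four frames, then moves one unit per frame.
demo = [[(float(x), 0.0, 0.0)] for x in (0, 0, 0, 0, 1, 2, 3, 4)]
halves = energy_split(demo, 2)
```

In this toy sequence the two-way split places frames 0 to 5 in the first segment and frames 6 to 7 in the second, so the fast part of the motion is not squeezed into a short uniform window. In a full pipeline, the pairwise points falling in each segment would be encoded separately (the paper uses Fisher Vectors) and the per-segment encodings concatenated.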