Human gesture recognition using 3.5-dimensional trajectory features for hands-free user interface

ACM/IEEE international workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Stream. Pub Date: 2010-10-29. DOI: 10.1145/1877868.1877872

Masaki Takahashi, Mahito Fujii, M. Naemura, S. Satoh


Citations: 10

Abstract

We present a new human motion recognition technique for a hands-free user interface. Although many motion recognition techniques for video sequences have been reported, no man-machine interface has yet been developed that recognizes a sufficient variety of motions. The difficulty lies in the limited spatial information that can be acquired from video sequences captured by a conventional camera. The proposed system uses a time-of-flight camera, which measures the distance to objects and therefore provides a depth image in addition to a normal grayscale image, so a variety of motions can be recognized accurately. The system has two main functions: gesture recognition and posture measurement. Gesture recognition is performed using the bag-of-words approach, with the trajectories of tracked key points around the human body as features. The main technical contribution of the proposed method is the use of 3.5D spatiotemporal trajectory features, which contain horizontal, vertical, time, and depth information. Posture measurement is performed through face detection and object tracking. The proposed user interface is natural to use because it does not require any contact-type device, such as a motion sensor controller. The effectiveness of the proposed 3.5D spatiotemporal features was confirmed in a comparative experiment against conventional 3.0D spatiotemporal features. The generality of the system was demonstrated in an experiment with multiple people, and its usefulness as a pointing device was demonstrated in a practical simulation.
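
As a rough illustration of why depth can matter in a bag-of-words pipeline over key-point trajectories, the following minimal sketch quantizes short trajectories against a small codebook and compares the resulting histograms with and without the depth channel. It is an assumption-laden toy, not the authors' implementation: the abstract does not specify the descriptor, the key-point tracker, how the codebook is learned, or the classifier. The displacement descriptor, the hand-picked three-word codebook, and all function names below are hypothetical. Time is represented only implicitly, by the order of the frame-to-frame steps.

```python
import math

def displacement_descriptor(track):
    """Concatenate frame-to-frame (dx, dy, dz) steps of one key-point track.

    `track` is a list of (x, y, depth) samples, one per frame (hypothetical format).
    """
    desc = []
    for (x0, y0, z0), (x1, y1, z1) in zip(track, track[1:]):
        desc.extend((x1 - x0, y1 - y0, z1 - z0))
    return desc

def nearest_codeword(desc, codebook):
    """Return the index of the codeword closest to `desc` (Euclidean distance)."""
    dists = [math.dist(desc, word) for word in codebook]
    return dists.index(min(dists))

def bag_of_words(tracks, codebook):
    """Normalized histogram of codeword assignments over all tracks."""
    hist = [0.0] * len(codebook)
    for track in tracks:
        hist[nearest_codeword(displacement_descriptor(track), codebook)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def drop_depth(track):
    """Simulate a camera without depth by holding the depth value constant."""
    return [(x, y, 0) for (x, y, _) in track]

# Hand-picked codebook for 3-frame tracks (two steps, six values each).
# A real system would learn the codebook from data, e.g. by clustering.
codebook = [
    [0, 0, 0, 0, 0, 0],    # word 0: key point stays still
    [1, 0, 0, 1, 0, 0],    # word 1: key point moves right
    [0, 0, -1, 0, 0, -1],  # word 2: key point moves toward the camera
]

# Two key points on a hand pushed straight toward the camera.
push = [
    [(10, 10, 5), (10, 10, 4), (10, 10, 3)],
    [(12, 11, 5), (12, 11, 4), (12, 11, 3)],
]

with_depth = bag_of_words(push, codebook)
without_depth = bag_of_words([drop_depth(t) for t in push], codebook)
print(with_depth)     # all mass on word 2 (toward camera)
print(without_depth)  # all mass on word 0 (indistinguishable from a still hand)
```

In this toy setting, the push gesture collapses onto the "still" codeword once depth is discarded, which mirrors the abstract's point that an ordinary camera provides too little spatial information. The resulting histograms would normally be passed to a classifier, which is omitted here.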

