Simplified Training for Gesture Recognition
Romain Faugeroux, Thales Vieira, D. M. Morera, T. Lewiner
2014 27th SIBGRAPI Conference on Graphics, Patterns and Images
DOI: 10.1109/SIBGRAPI.2014.46 · Published 2014-08-26 · Citations: 4
Abstract
Since gesture is a fundamental form of human communication, its recognition by a computer generates strong interest for many applications such as natural user interfaces and gaming. The popularization of real-time depth sensors brings such applications to the public at large. However, familiar gestures are culture-specific, so their automatic recognition must result from a machine learning process. This requires either teaching the user how to communicate with the machine, as on popular mobile devices or gaming consoles, or tailoring the application to a specific public. The latter option serves a large number of applications such as sport monitoring, virtual reality, or surveillance, although it usually requires tedious training. This work intends to simplify the training required by gesture recognition methods. Whereas the traditional procedure defines and trains a set of key poses before defining and training a set of gestures, we propose to automatically deduce the set of key poses from the gesture training. We represent a recording of gestures as a curve in a high-dimensional space and robustly segment it according to the curvature of that curve. We then use an asymmetric Hausdorff distance between gestures to define a discriminant key pose as the pose of one gesture most distant from the other gestures. This further allows gestures to be grouped dynamically by similarity. The training only requires the user to perform the gestures and, optionally, refine the gesture labeling. The generated set of key poses and gestures then fits into previous human action recognition algorithms. Furthermore, this semi-supervised learning allows a previous training to be reused to extend the set of gestures the computer should recognize. Experimental results show that the automatically generated discriminant key poses lead to recognition accuracy similar to previous work.
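To make the two central notions concrete, the sketch below illustrates an asymmetric (directed) Hausdorff distance between two gestures, each given as a sequence of pose feature vectors, and a discriminant key-pose selection that picks the pose of one gesture farthest from all other gestures. This is a minimal illustration, not the paper's implementation: the pose encoding, the function names, and the brute-force pairwise search are assumptions for the example.

```python
import numpy as np

def directed_hausdorff(A, B):
    """Directed (asymmetric) Hausdorff distance from gesture A to gesture B.

    A, B: arrays of shape (n, d) and (m, d); each row is one skeleton pose
    encoded as a d-dimensional feature vector (encoding assumed here).
    Returns max over poses in A of the distance to the closest pose in B.
    Asymmetric: directed_hausdorff(A, B) != directed_hausdorff(B, A) in general.
    """
    # pairwise Euclidean distances between every pose of A and every pose of B
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # shape (n, m)
    return d.min(axis=1).max()

def discriminant_key_pose(gesture, others):
    """Index of the pose of `gesture` that is most distant from every other
    gesture, i.e. the pose realizing the directed Hausdorff distance."""
    # for each pose, track its distance to the nearest pose of any other gesture
    nearest = np.full(len(gesture), np.inf)
    for other in others:
        d = np.linalg.norm(gesture[:, None, :] - other[None, :, :], axis=2)
        nearest = np.minimum(nearest, d.min(axis=1))
    # the pose whose nearest foreign pose is farthest away discriminates best
    return int(nearest.argmax())
```

The asymmetry matters for discrimination: a gesture fully contained in another has zero directed distance to it, while the converse distance exposes the poses the larger gesture has and the smaller one lacks.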