Recognizing complex, parameterized gestures from monocular image sequences

Tobias Axenbeck, Maren Bennewitz, Sven Behnke, Wolfram Burgard

Humanoids 2008 - 8th IEEE-RAS International Conference on Humanoid Robots, December 2008. DOI: 10.1109/ICHR.2008.4755973
Robotic assistants designed to coexist and communicate with humans in the real world should be able to interact with them in an intuitive way. This requires that the robots be able to recognize typical gestures performed by humans, such as head shaking/nodding, hand waving, or pointing. In this paper, we present a system that is able to spot and recognize complex, parameterized gestures from monocular image sequences. To represent people, we locate their faces and hands using trained classifiers and track them over time. We use a few expressive features extracted from this compact representation as input to hidden Markov models (HMMs). First, we segment gestures into distinct phases and train an HMM for each phase separately. Then, we construct composed HMMs, which consist of the individual phase HMMs. Once a specific phase is recognized, we estimate the parameters of the current gesture, e.g., the target of a pointing gesture. As we demonstrate in the experiments, our method is able to robustly locate and track hands, despite the fact that they can take on a large number of substantially different shapes. Based on this, our system is able to reliably spot and recognize a variety of complex, parameterized gestures.
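To make the phase-segmentation and composition idea concrete, here is a minimal, illustrative sketch of composed left-to-right Gaussian HMMs in Python. It is not the authors' implementation: the two-phase "waving" example, the 2-D feature vectors, and all numeric values are invented for illustration, and face/hand detection and tracking are assumed to have already produced the feature sequence.

```python
# Illustrative sketch only, NOT the paper's implementation: each gesture
# phase is a small left-to-right Gaussian HMM, phase models are concatenated
# into one composed HMM per gesture, and a feature sequence is scored by its
# log-likelihood under the composed model.
import numpy as np

LOG0 = -1e30  # stand-in for log(0)

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

class LeftRightHMM:
    """Left-to-right HMM with diagonal-Gaussian emissions."""
    def __init__(self, means, variances, stay_prob=0.6):
        self.means = np.asarray(means, dtype=float)
        self.vars = np.asarray(variances, dtype=float)
        n = len(self.means)
        # Each state either stays put or advances to the next state.
        self.log_trans = np.full((n, n), LOG0)
        for i in range(n):
            self.log_trans[i, i] = np.log(stay_prob)
            if i + 1 < n:
                self.log_trans[i, i + 1] = np.log(1.0 - stay_prob)
            else:
                self.log_trans[i, i] = 0.0  # final state is absorbing
        self.log_start = np.full(n, LOG0)
        self.log_start[0] = 0.0  # always start in the first state

    def log_likelihood(self, obs):
        """Forward algorithm in the log domain."""
        alpha = self.log_start + np.array(
            [log_gauss(obs[0], m, v) for m, v in zip(self.means, self.vars)])
        for x in obs[1:]:
            emit = np.array(
                [log_gauss(x, m, v) for m, v in zip(self.means, self.vars)])
            # log-sum-exp over predecessor states for each successor state
            alpha = emit + np.array([
                np.logaddexp.reduce(alpha + self.log_trans[:, j])
                for j in range(len(self.means))])
        return np.logaddexp.reduce(alpha)

def compose(phase_models):
    """Concatenate phase HMMs into one left-to-right gesture HMM, so the
    last state of each phase feeds into the first state of the next."""
    means = np.concatenate([m.means for m in phase_models])
    variances = np.concatenate([m.vars for m in phase_models])
    return LeftRightHMM(means, variances)

# Hypothetical 2-D features (e.g., hand position relative to the face):
# two phases of a waving gesture, raising the hand and oscillating it.
raise_phase = LeftRightHMM(means=[[0.0, 0.2], [0.1, 0.6]],
                           variances=[[0.05, 0.05], [0.05, 0.05]])
wave_phase = LeftRightHMM(means=[[0.3, 0.9], [-0.3, 0.9]],
                          variances=[[0.05, 0.05], [0.05, 0.05]])
waving = compose([raise_phase, wave_phase])

obs = np.array([[0.0, 0.2], [0.1, 0.6], [0.3, 0.9], [-0.3, 0.9], [0.3, 0.9]])
print("log p(obs | waving) =", waving.log_likelihood(obs))
```

In a full system along these lines, one composed HMM per gesture would be trained from labeled phase segments, and an observation window would be assigned to the gesture whose composed model yields the highest log-likelihood, with gesture parameters (such as a pointing target) estimated once the corresponding phase is recognized.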