Tracking Hand Articulations: Relying on 3D Visual Hulls Versus Relying on Multiple 2D Cues

2013 International Symposium on Ubiquitous Virtual Reality. Pub Date: 2013-07-10. DOI: 10.1109/ISUVR.2013.13

I. Oikonomidis, Nikolaos Kyriazis, Konstantinos Tzevanidis, Antonis A. Argyros

Citations: 8

Abstract

We present a method for articulated hand tracking that relies on visual input acquired by a calibrated multicamera system. A state-of-the-art result on this problem has been presented in [12]. In that work, hand tracking is formulated as the minimization of an objective function that quantifies the discrepancy between a hand pose hypothesis and the observations. The objective function treats the observations from each camera view in an independent way. We follow the same general optimization framework but we choose to employ the visual hull [10] as the main observation cue, which results from the integration of information from all available views prior to optimization. We investigate the behavior of the resulting method in extensive experiments and in comparison with that of [12]. The obtained results demonstrate that for low levels of noise contamination, regardless of the number of cameras, the two methods perform comparably. The situation changes when noisy observations or as few as two cameras with short baselines are employed. In these cases, the proposed method is more accurate than that of [12]. Thus, the proposed method is preferable in real-world scenarios with noisy observations obtained from easy-to-deploy, stereo camera setups.
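
The idea of fusing all views into a visual hull before optimization, then scoring a pose hypothesis against it, can be illustrated with a minimal voxel-carving sketch. This is not the authors' implementation. The function names `visual_hull` and `hull_discrepancy`, the 3x4 projection-matrix convention, and the 1 - IoU score are assumptions chosen for illustration; the abstract does not specify the exact form of the objective function.

```python
import numpy as np

def visual_hull(silhouettes, projections, grid_points):
    """Carve a voxel grid with binary silhouettes (illustrative sketch).

    silhouettes: list of 2D boolean arrays, indexed as mask[row, col].
    projections: list of 3x4 camera matrices (assumed calibrated).
    grid_points: (N, 3) array of candidate 3D points.
    Returns a boolean array of length N that is True where the point
    projects inside the silhouette in every view.
    """
    n = len(grid_points)
    inside = np.ones(n, dtype=bool)
    homog = np.hstack([grid_points, np.ones((n, 1))])
    for mask, P in zip(silhouettes, projections):
        uvw = homog @ P.T                      # homogeneous pixel coordinates
        w = uvw[:, 2]
        valid = w > 1e-9                       # discard points behind the camera
        safe_w = np.where(valid, w, 1.0)
        u = np.round(uvw[:, 0] / safe_w).astype(int)
        v = np.round(uvw[:, 1] / safe_w).astype(int)
        rows, cols = mask.shape
        in_image = valid & (u >= 0) & (u < cols) & (v >= 0) & (v < rows)
        hit = np.zeros(n, dtype=bool)
        hit[in_image] = mask[v[in_image], u[in_image]]
        inside &= hit                          # a point must be seen in all views
    return inside

def hull_discrepancy(observed, hypothesized):
    """Return 1 - IoU of two occupancy masks (assumed score; 0 means identical)."""
    union = np.logical_or(observed, hypothesized).sum()
    if union == 0:
        return 0.0
    intersection = np.logical_and(observed, hypothesized).sum()
    return 1.0 - intersection / union
```

In a tracking loop of this kind, an optimizer would render the occupancy of each hand pose hypothesis on the same grid and keep the hypothesis with the lowest discrepancy. Because the silhouettes are intersected before scoring, noise that appears in only one view tends to be carved away, which is consistent with the robustness the abstract reports.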
