ARnnotate: An Augmented Reality Interface for Collecting Custom Dataset of 3D Hand-Object Interaction Pose Estimation

Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology. Pub Date: 2022-10-28. DOI: 10.1145/3526113.3545663

Xun Qian, F. He, Xiyun Hu, Tianyi Wang, K. Ramani


Citations: 6

Abstract

Vision-based 3D pose estimation has substantial potential in hand-object interaction applications, but it requires user-specified datasets to achieve robust performance. We propose ARnnotate, an Augmented Reality (AR) interface that enables end-users to create custom data using a hand-tracking-capable AR device. Unlike other dataset collection strategies, ARnnotate first guides the user to manipulate a virtual bounding box and records the box’s poses and the user’s hand joint positions as the labels. Leveraging the spatial awareness of AR, the user then manipulates the corresponding physical object while following the in-situ AR animation of the bounding box and hand model, and ARnnotate captures the user’s first-person view as the images of the dataset. A 12-participant user study demonstrated the system’s usability in terms of the spatial accuracy of the labels, the satisfactory performance of the deep neural networks trained with the data collected by ARnnotate, and the users’ subjective feedback.
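The two-stage collection described in the abstract (labels recorded first, images captured afterwards while the user follows the replayed animation) can be pictured with a small sketch. The abstract does not say how labels and images are matched, so everything here is an assumption made for illustration: the `PoseLabel` record, its field names, the quaternion rotation format, and the nearest-timestamp pairing rule are hypothetical and are not taken from ARnnotate itself.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]  # assumed (x, y, z, w) order


@dataclass
class PoseLabel:
    """One recorded label: bounding-box pose plus hand joint positions (hypothetical layout)."""
    t: float                 # seconds since the start of the recording
    box_position: Vec3       # bounding-box center
    box_rotation: Quat       # bounding-box orientation
    hand_joints: List[Vec3]  # one 3D position per tracked hand joint


def nearest_label(labels: List[PoseLabel], t: float) -> PoseLabel:
    """Return the label closest in time to t; labels must be sorted by t."""
    if not labels:
        raise ValueError("no labels recorded")
    times = [label.t for label in labels]
    i = bisect_left(times, t)
    if i == 0:
        return labels[0]
    if i == len(labels):
        return labels[-1]
    before, after = labels[i - 1], labels[i]
    # Ties go to the earlier label.
    return before if t - before.t <= after.t - t else after


def build_dataset(
    labels: List[PoseLabel], frames: List[Tuple[float, str]]
) -> List[Tuple[str, PoseLabel]]:
    """Pair each captured (timestamp, image_path) frame with its nearest label."""
    return [(path, nearest_label(labels, t)) for t, path in frames]


if __name__ == "__main__":
    identity: Quat = (0.0, 0.0, 0.0, 1.0)
    recorded = [
        PoseLabel(float(k), (0.1 * k, 0.0, 0.5), identity, [(0.0, 0.0, 0.0)])
        for k in range(3)
    ]
    captured = [(0.1, "frame_000.png"), (0.9, "frame_001.png"), (1.6, "frame_002.png")]
    for path, label in build_dataset(recorded, captured):
        print(path, label.t)
```

Nearest-timestamp matching is only the simplest pairing rule; a system that replays the recorded animation could equally index labels by animation frame, which the abstract leaves open.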



