{"title":"Gesture Enhanced Comprehension of Ambiguous Human-to-Robot Instructions","authors":"Dulanga Weerakoon, Vigneshwaran Subbaraju, Nipuni Karumpulli, Tuan Tran, Qianli Xu, U-Xuan Tan, Joo-Hwee Lim, Archan Misra","doi":"10.1145/3382507.3418863","DOIUrl":null,"url":null,"abstract":"This work demonstrates the feasibility and benefits of using pointing gestures, a naturally-generated additional input modality, to improve the multi-modal comprehension accuracy of human instructions to robotic agents for collaborative tasks.We present M2Gestic, a system that combines neural-based text parsing with a novel knowledge-graph traversal mechanism, over a multi-modal input of vision, natural language text and pointing. Via multiple studies related to a benchmark table top manipulation task, we show that (a) M2Gestic can achieve close-to-human performance in reasoning over unambiguous verbal instructions, and (b) incorporating pointing input (even with its inherent location uncertainty) in M2Gestic results in a significant (30%) accuracy improvement when verbal instructions are ambiguous.","PeriodicalId":402394,"journal":{"name":"Proceedings of the 2020 International Conference on Multimodal Interaction","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2020 International Conference on Multimodal Interaction","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3382507.3418863","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6
Abstract
This work demonstrates the feasibility and benefits of using pointing gestures, a naturally generated additional input modality, to improve the multi-modal comprehension accuracy of human instructions to robotic agents in collaborative tasks. We present M2Gestic, a system that combines neural text parsing with a novel knowledge-graph traversal mechanism over a multi-modal input of vision, natural language text, and pointing. Through multiple studies on a benchmark tabletop manipulation task, we show that (a) M2Gestic achieves close-to-human performance in reasoning over unambiguous verbal instructions, and (b) incorporating pointing input into M2Gestic (even with its inherent location uncertainty) yields a significant (30%) accuracy improvement when verbal instructions are ambiguous.
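To illustrate why a pointing gesture helps when the verbal reference is ambiguous, here is a minimal, hypothetical sketch (not the paper's M2Gestic implementation): candidate objects are ranked by blending a text-match score with a Gaussian score over the angle between the pointing ray and each object, where the Gaussian width stands in for the gesture's location uncertainty. All names, parameters, and the scoring scheme below are illustrative assumptions.

```python
import math
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str            # object label from a vision pipeline (assumed)
    position: tuple      # (x, y, z) in the robot's frame, metres
    text_score: float    # how well the label matches the parsed instruction, in [0, 1]

def angular_distance(pointing_origin, pointing_dir, target):
    """Angle (radians) between the pointing ray and the ray from the hand to the target."""
    to_target = [t - o for t, o in zip(target, pointing_origin)]
    norm_t = math.sqrt(sum(v * v for v in to_target))
    norm_d = math.sqrt(sum(v * v for v in pointing_dir))
    cos_a = sum(a * b for a, b in zip(to_target, pointing_dir)) / (norm_t * norm_d)
    return math.acos(max(-1.0, min(1.0, cos_a)))

def rank_candidates(candidates, pointing_origin, pointing_dir, sigma=0.35, w_text=0.5):
    """Blend the text-match score with a Gaussian score over the pointing angle.
    sigma (radians) models the location uncertainty of a natural pointing gesture."""
    ranked = []
    for c in candidates:
        angle = angular_distance(pointing_origin, pointing_dir, c.position)
        gesture_score = math.exp(-(angle ** 2) / (2 * sigma ** 2))
        ranked.append((w_text * c.text_score + (1 - w_text) * gesture_score, c))
    return sorted(ranked, key=lambda pair: pair[0], reverse=True)

# Usage: "pick up the red block" with two red blocks on the table.
# Text alone scores both equally; the pointing ray toward the right block breaks the tie.
blocks = [
    Candidate("red block (left)",  (0.40, -0.20, 0.02), 0.9),
    Candidate("red block (right)", (0.40,  0.25, 0.02), 0.9),
]
for score, c in rank_candidates(blocks, (0.0, 0.0, 0.3), (0.8, 0.5, -0.5)):
    print(f"{score:.3f}  {c.name}")
```

In this toy setup the text score cannot separate the two red blocks, but the angular term strongly favours the object closer to the pointing ray, which is the intuition behind the reported accuracy gain on ambiguous instructions.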