{"title":"基于多模态融合框架的人机协作指令解释","authors":"Jonathan Cacace, Alberto Finzi, V. Lippiello","doi":"10.1109/ROMAN.2017.8172329","DOIUrl":null,"url":null,"abstract":"We present a novel multimodal interaction framework supporting robust human-robot communication. We consider a scenario where a human operator can exploit multiple communication channels to interact with one or more robots in order to accomplish shared tasks. Moreover, we assume that the human is not fully dedicated to the robot control, but also involved in other activities, hence only able to interact with the robotic system in a sparse and incomplete manner. In this context, several human or environmental factors could cause errors, noise and wrong interpretations of the commands. The main goal of this work is to improve the robustness of humanrobot interaction systems in similar situations. In particular, we propose a multimodal fusion method based on the following steps: for each communication channel, unimodal classifiers are firstly deployed in order to generate unimodal interpretations of the human inputs; the unimodal outcomes are then grouped into different multimodal recognition lines, each representing a possible interpretation of a sequence of multimodal inputs; these lines are finally assessed in order to recognize the human commands. We discuss the system at work in a case study in which a human rescuer interacts with a team of flying robots during Search & Rescue missions. In this scenario, we present and discuss real world experiments to demonstrate the effectiveness of the proposed framework.","PeriodicalId":134777,"journal":{"name":"2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)","volume":"99 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"A robust multimodal fusion framework for command interpretation in human-robot cooperation\",\"authors\":\"Jonathan Cacace, Alberto Finzi, V. Lippiello\",\"doi\":\"10.1109/ROMAN.2017.8172329\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present a novel multimodal interaction framework supporting robust human-robot communication. We consider a scenario where a human operator can exploit multiple communication channels to interact with one or more robots in order to accomplish shared tasks. Moreover, we assume that the human is not fully dedicated to the robot control, but also involved in other activities, hence only able to interact with the robotic system in a sparse and incomplete manner. In this context, several human or environmental factors could cause errors, noise and wrong interpretations of the commands. The main goal of this work is to improve the robustness of humanrobot interaction systems in similar situations. In particular, we propose a multimodal fusion method based on the following steps: for each communication channel, unimodal classifiers are firstly deployed in order to generate unimodal interpretations of the human inputs; the unimodal outcomes are then grouped into different multimodal recognition lines, each representing a possible interpretation of a sequence of multimodal inputs; these lines are finally assessed in order to recognize the human commands. We discuss the system at work in a case study in which a human rescuer interacts with a team of flying robots during Search & Rescue missions. 
In this scenario, we present and discuss real world experiments to demonstrate the effectiveness of the proposed framework.\",\"PeriodicalId\":134777,\"journal\":{\"name\":\"2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)\",\"volume\":\"99 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ROMAN.2017.8172329\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ROMAN.2017.8172329","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
We present a novel multimodal interaction framework supporting robust human-robot communication. We consider a scenario in which a human operator can exploit multiple communication channels to interact with one or more robots in order to accomplish shared tasks. Moreover, we assume that the human is not fully dedicated to robot control but is also involved in other activities, and is therefore only able to interact with the robotic system in a sparse and incomplete manner. In this context, several human or environmental factors can cause errors, noise, and wrong interpretations of the commands. The main goal of this work is to improve the robustness of human-robot interaction systems in such situations. In particular, we propose a multimodal fusion method based on the following steps: for each communication channel, unimodal classifiers are first deployed to generate unimodal interpretations of the human inputs; the unimodal outcomes are then grouped into different multimodal recognition lines, each representing a possible interpretation of a sequence of multimodal inputs; these lines are finally assessed in order to recognize the human commands. We discuss the system at work in a case study in which a human rescuer interacts with a team of flying robots during Search & Rescue missions. In this scenario, we present and discuss real-world experiments that demonstrate the effectiveness of the proposed framework.
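The abstract describes the fusion method only at a high level. The following minimal sketch illustrates one possible reading of the three steps (per-channel unimodal classification, grouping of outcomes into multimodal recognition lines, and assessment of those lines); it is not the authors' implementation, and all class names, the time-window grouping rule, and the scoring heuristic are illustrative assumptions.

```python
# Illustrative sketch (not the paper's implementation) of the three-step
# fusion pipeline: unimodal classification per channel, grouping of the
# outcomes into multimodal "recognition lines", and assessment of the
# lines to pick the most plausible command interpretation.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

@dataclass
class UnimodalOutcome:
    channel: str        # e.g. "speech", "gesture" (hypothetical channel names)
    label: str          # e.g. "takeoff", "select_uav" (hypothetical commands)
    confidence: float   # classifier score in [0, 1]
    timestamp: float    # seconds; used to associate temporally close inputs

@dataclass
class RecognitionLine:
    """One hypothesis about how a sequence of multimodal inputs fits together."""
    outcomes: List[UnimodalOutcome] = field(default_factory=list)

    def score(self) -> float:
        # Assumed assessment rule: average confidence, slightly favoring
        # lines that combine evidence from more than one channel.
        if not self.outcomes:
            return 0.0
        avg = sum(o.confidence for o in self.outcomes) / len(self.outcomes)
        channels = {o.channel for o in self.outcomes}
        return avg * (1.0 + 0.1 * (len(channels) - 1))

def fuse(raw_inputs: Dict[str, object],
         classifiers: Dict[str, Callable[[object], UnimodalOutcome]],
         window: float = 2.0) -> Optional[RecognitionLine]:
    """Run unimodal classifiers, group temporally close outcomes into
    recognition lines, and return the best-scoring line (if any)."""
    # Step 1: unimodal interpretation of each channel's raw input.
    outcomes = [classifiers[ch](data) for ch, data in raw_inputs.items()
                if ch in classifiers]

    # Step 2: group outcomes whose timestamps fall within the same window.
    outcomes.sort(key=lambda o: o.timestamp)
    lines: List[RecognitionLine] = []
    for o in outcomes:
        placed = False
        for line in lines:
            if o.timestamp - line.outcomes[-1].timestamp <= window:
                line.outcomes.append(o)
                placed = True
                break
        if not placed:
            lines.append(RecognitionLine(outcomes=[o]))

    # Step 3: assess the lines and keep the most plausible interpretation.
    return max(lines, key=lambda l: l.score(), default=None)
```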