{"title":"基于多模态融合框架的人机协作指令解释","authors":"Jonathan Cacace, Alberto Finzi, V. Lippiello","doi":"10.1109/ROMAN.2017.8172329","DOIUrl":null,"url":null,"abstract":"We present a novel multimodal interaction framework supporting robust human-robot communication. We consider a scenario where a human operator can exploit multiple communication channels to interact with one or more robots in order to accomplish shared tasks. Moreover, we assume that the human is not fully dedicated to the robot control, but also involved in other activities, hence only able to interact with the robotic system in a sparse and incomplete manner. In this context, several human or environmental factors could cause errors, noise and wrong interpretations of the commands. The main goal of this work is to improve the robustness of humanrobot interaction systems in similar situations. In particular, we propose a multimodal fusion method based on the following steps: for each communication channel, unimodal classifiers are firstly deployed in order to generate unimodal interpretations of the human inputs; the unimodal outcomes are then grouped into different multimodal recognition lines, each representing a possible interpretation of a sequence of multimodal inputs; these lines are finally assessed in order to recognize the human commands. We discuss the system at work in a case study in which a human rescuer interacts with a team of flying robots during Search & Rescue missions. In this scenario, we present and discuss real world experiments to demonstrate the effectiveness of the proposed framework.","PeriodicalId":134777,"journal":{"name":"2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)","volume":"99 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"A robust multimodal fusion framework for command interpretation in human-robot cooperation\",\"authors\":\"Jonathan Cacace, Alberto Finzi, V. Lippiello\",\"doi\":\"10.1109/ROMAN.2017.8172329\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present a novel multimodal interaction framework supporting robust human-robot communication. We consider a scenario where a human operator can exploit multiple communication channels to interact with one or more robots in order to accomplish shared tasks. Moreover, we assume that the human is not fully dedicated to the robot control, but also involved in other activities, hence only able to interact with the robotic system in a sparse and incomplete manner. In this context, several human or environmental factors could cause errors, noise and wrong interpretations of the commands. The main goal of this work is to improve the robustness of humanrobot interaction systems in similar situations. In particular, we propose a multimodal fusion method based on the following steps: for each communication channel, unimodal classifiers are firstly deployed in order to generate unimodal interpretations of the human inputs; the unimodal outcomes are then grouped into different multimodal recognition lines, each representing a possible interpretation of a sequence of multimodal inputs; these lines are finally assessed in order to recognize the human commands. We discuss the system at work in a case study in which a human rescuer interacts with a team of flying robots during Search & Rescue missions. 
In this scenario, we present and discuss real world experiments to demonstrate the effectiveness of the proposed framework.\",\"PeriodicalId\":134777,\"journal\":{\"name\":\"2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)\",\"volume\":\"99 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ROMAN.2017.8172329\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ROMAN.2017.8172329","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
We present a novel multimodal interaction framework supporting robust human-robot communication. We consider a scenario in which a human operator can exploit multiple communication channels to interact with one or more robots in order to accomplish shared tasks. Moreover, we assume that the human is not fully dedicated to robot control but is also involved in other activities, and is therefore only able to interact with the robotic system in a sparse and incomplete manner. In this context, several human or environmental factors can cause errors, noise, and wrong interpretations of the commands. The main goal of this work is to improve the robustness of human-robot interaction systems in such situations. In particular, we propose a multimodal fusion method based on the following steps: for each communication channel, unimodal classifiers are first deployed to generate unimodal interpretations of the human inputs; the unimodal outcomes are then grouped into different multimodal recognition lines, each representing a possible interpretation of a sequence of multimodal inputs; these lines are finally assessed in order to recognize the human commands. We discuss the system at work in a case study in which a human rescuer interacts with a team of flying robots during Search & Rescue missions. In this scenario, we present and discuss real-world experiments that demonstrate the effectiveness of the proposed framework.
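The abstract describes the fusion method only at a high level. The following minimal sketch illustrates one possible reading of the three steps (per-channel unimodal classification, grouping of outcomes into multimodal recognition lines, and assessment of those lines); it is not the authors' implementation, and all class names, the time-window grouping rule, and the scoring heuristic are illustrative assumptions.

```python
# Illustrative sketch (not the paper's implementation) of the three-step
# fusion pipeline: unimodal classification per channel, grouping of the
# outcomes into multimodal "recognition lines", and assessment of the
# lines to pick the most plausible command interpretation.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

@dataclass
class UnimodalOutcome:
    channel: str        # e.g. "speech", "gesture" (hypothetical channel names)
    label: str          # e.g. "takeoff", "select_uav" (hypothetical commands)
    confidence: float   # classifier score in [0, 1]
    timestamp: float    # seconds; used to associate temporally close inputs

@dataclass
class RecognitionLine:
    """One hypothesis about how a sequence of multimodal inputs fits together."""
    outcomes: List[UnimodalOutcome] = field(default_factory=list)

    def score(self) -> float:
        # Assumed assessment rule: average confidence, slightly favoring
        # lines that combine evidence from more than one channel.
        if not self.outcomes:
            return 0.0
        avg = sum(o.confidence for o in self.outcomes) / len(self.outcomes)
        channels = {o.channel for o in self.outcomes}
        return avg * (1.0 + 0.1 * (len(channels) - 1))

def fuse(raw_inputs: Dict[str, object],
         classifiers: Dict[str, Callable[[object], UnimodalOutcome]],
         window: float = 2.0) -> Optional[RecognitionLine]:
    """Run unimodal classifiers, group temporally close outcomes into
    recognition lines, and return the best-scoring line (if any)."""
    # Step 1: unimodal interpretation of each channel's raw input.
    outcomes = [classifiers[ch](data) for ch, data in raw_inputs.items()
                if ch in classifiers]

    # Step 2: group outcomes whose timestamps fall within the same window.
    outcomes.sort(key=lambda o: o.timestamp)
    lines: List[RecognitionLine] = []
    for o in outcomes:
        placed = False
        for line in lines:
            if o.timestamp - line.outcomes[-1].timestamp <= window:
                line.outcomes.append(o)
                placed = True
                break
        if not placed:
            lines.append(RecognitionLine(outcomes=[o]))

    # Step 3: assess the lines and keep the most plausible interpretation.
    return max(lines, key=lambda l: l.score(), default=None)
```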