{"title":"多模态人机通信的鲁棒语音理解","authors":"S. Hüwel, B. Wrede, G. Sagerer","doi":"10.1109/ROMAN.2006.314393","DOIUrl":null,"url":null,"abstract":"In order to model complex human robot interaction researchers not only have to consider different tasks but also to handle the complex interplay of different modules of one single robot system. In our context we constructed a robot assistant integrated in a home or office environment. We allow for a fairly natural communication style, which means that the users communicate using speech but are also allowed to use gestures and moreover to use contextual scene knowledge. Against this background, this paper presents a robust speech understanding component for situated human-robot communication. It serves as interface between speech recognition and dialog management. To increase robustness of speech processing it rates the speech recognition output by means of semantic coherence. Even if the recognized word-stream is not grammatically correct the speech understanding component provides semantic interpretations in context of multi-modal input for dialog management. For the understanding process, we designed special semantic concepts grounded to the domain of situated communication. They also provide additional information about the dialog act. A processing mechanism uses these concept units to generate the most likely semantic interpretation of the utterances","PeriodicalId":254129,"journal":{"name":"ROMAN 2006 - The 15th IEEE International Symposium on Robot and Human Interactive Communication","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"29","resultStr":"{\"title\":\"Robust Speech Understanding for Multi-Modal Human-Robot Communication\",\"authors\":\"S. Hüwel, B. Wrede, G. Sagerer\",\"doi\":\"10.1109/ROMAN.2006.314393\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In order to model complex human robot interaction researchers not only have to consider different tasks but also to handle the complex interplay of different modules of one single robot system. In our context we constructed a robot assistant integrated in a home or office environment. We allow for a fairly natural communication style, which means that the users communicate using speech but are also allowed to use gestures and moreover to use contextual scene knowledge. Against this background, this paper presents a robust speech understanding component for situated human-robot communication. It serves as interface between speech recognition and dialog management. To increase robustness of speech processing it rates the speech recognition output by means of semantic coherence. Even if the recognized word-stream is not grammatically correct the speech understanding component provides semantic interpretations in context of multi-modal input for dialog management. For the understanding process, we designed special semantic concepts grounded to the domain of situated communication. They also provide additional information about the dialog act. 
A processing mechanism uses these concept units to generate the most likely semantic interpretation of the utterances\",\"PeriodicalId\":254129,\"journal\":{\"name\":\"ROMAN 2006 - The 15th IEEE International Symposium on Robot and Human Interactive Communication\",\"volume\":\"27 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2006-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"29\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ROMAN 2006 - The 15th IEEE International Symposium on Robot and Human Interactive Communication\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ROMAN.2006.314393\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ROMAN 2006 - The 15th IEEE International Symposium on Robot and Human Interactive Communication","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ROMAN.2006.314393","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Robust Speech Understanding for Multi-Modal Human-Robot Communication
To model complex human-robot interaction, researchers not only have to consider different tasks but also have to handle the complex interplay of the modules within a single robot system. In our context, we constructed a robot assistant integrated into a home or office environment. We allow for a fairly natural communication style: users communicate using speech, but they may also use gestures and contextual scene knowledge. Against this background, this paper presents a robust speech understanding component for situated human-robot communication. It serves as the interface between speech recognition and dialog management. To increase the robustness of speech processing, it rates the speech recognition output by means of semantic coherence. Even if the recognized word stream is not grammatically correct, the speech understanding component provides semantic interpretations in the context of multi-modal input for dialog management. For the understanding process, we designed special semantic concepts grounded in the domain of situated communication. These concepts also provide additional information about the dialog act. A processing mechanism uses these concept units to generate the most likely semantic interpretation of an utterance.
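The abstract does not describe the mechanism in detail, but the idea of rating recognizer output by semantic coherence can be illustrated with a minimal sketch. The concept inventory, trigger words, dialog-act labels, and coverage-based coherence score below are invented for illustration only and are not the authors' implementation; they merely show how ungrammatical n-best hypotheses could be mapped onto domain concept units and the most coherent interpretation selected.

```python
# Hypothetical sketch (not the paper's implementation): map noisy speech
# recognition hypotheses onto domain concept units and keep the hypothesis
# whose interpretation is most semantically coherent.

from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptUnit:
    """A domain concept with the words that can evoke it (assumed structure)."""
    name: str
    triggers: frozenset
    dialog_act: str  # e.g. "instruction" or "reference"; illustrative labels only


# A tiny, made-up concept inventory for a home/office robot domain.
CONCEPTS = [
    ConceptUnit("FETCH_OBJECT", frozenset({"bring", "fetch", "get"}), "instruction"),
    ConceptUnit("OBJECT_CUP", frozenset({"cup", "mug"}), "reference"),
    ConceptUnit("LOCATION_TABLE", frozenset({"table", "desk"}), "reference"),
]


def interpret(hypothesis: str):
    """Map a (possibly ungrammatical) word stream to concepts and score coherence."""
    words = hypothesis.lower().split()
    matched = [c for c in CONCEPTS if any(w in c.triggers for w in words)]
    # Toy coherence measure: fraction of words covered by some concept unit.
    covered = sum(1 for w in words if any(w in c.triggers for c in CONCEPTS))
    coherence = covered / max(len(words), 1)
    return matched, coherence


def best_interpretation(n_best):
    """Pick the recognition hypothesis whose interpretation is most coherent."""
    return max((interpret(h) + (h,) for h in n_best), key=lambda t: t[1])


if __name__ == "__main__":
    # Even an ungrammatical word stream still yields a usable interpretation.
    concepts, score, hyp = best_interpretation(
        ["bring cup the table uh", "ring cop a label"]
    )
    print(hyp, round(score, 2), [c.name for c in concepts])
```

In this toy version the dialog act attached to each matched concept unit could be passed on to dialog management together with the interpretation, mirroring the role the abstract assigns to the semantic concepts.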