{"title":"多模态人机通信的鲁棒语音理解","authors":"S. Hüwel, B. Wrede, G. Sagerer","doi":"10.1109/ROMAN.2006.314393","DOIUrl":null,"url":null,"abstract":"In order to model complex human robot interaction researchers not only have to consider different tasks but also to handle the complex interplay of different modules of one single robot system. In our context we constructed a robot assistant integrated in a home or office environment. We allow for a fairly natural communication style, which means that the users communicate using speech but are also allowed to use gestures and moreover to use contextual scene knowledge. Against this background, this paper presents a robust speech understanding component for situated human-robot communication. It serves as interface between speech recognition and dialog management. To increase robustness of speech processing it rates the speech recognition output by means of semantic coherence. Even if the recognized word-stream is not grammatically correct the speech understanding component provides semantic interpretations in context of multi-modal input for dialog management. For the understanding process, we designed special semantic concepts grounded to the domain of situated communication. They also provide additional information about the dialog act. A processing mechanism uses these concept units to generate the most likely semantic interpretation of the utterances","PeriodicalId":254129,"journal":{"name":"ROMAN 2006 - The 15th IEEE International Symposium on Robot and Human Interactive Communication","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"29","resultStr":"{\"title\":\"Robust Speech Understanding for Multi-Modal Human-Robot Communication\",\"authors\":\"S. Hüwel, B. Wrede, G. Sagerer\",\"doi\":\"10.1109/ROMAN.2006.314393\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In order to model complex human robot interaction researchers not only have to consider different tasks but also to handle the complex interplay of different modules of one single robot system. In our context we constructed a robot assistant integrated in a home or office environment. We allow for a fairly natural communication style, which means that the users communicate using speech but are also allowed to use gestures and moreover to use contextual scene knowledge. Against this background, this paper presents a robust speech understanding component for situated human-robot communication. It serves as interface between speech recognition and dialog management. To increase robustness of speech processing it rates the speech recognition output by means of semantic coherence. Even if the recognized word-stream is not grammatically correct the speech understanding component provides semantic interpretations in context of multi-modal input for dialog management. For the understanding process, we designed special semantic concepts grounded to the domain of situated communication. They also provide additional information about the dialog act. 
A processing mechanism uses these concept units to generate the most likely semantic interpretation of the utterances\",\"PeriodicalId\":254129,\"journal\":{\"name\":\"ROMAN 2006 - The 15th IEEE International Symposium on Robot and Human Interactive Communication\",\"volume\":\"27 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2006-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"29\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ROMAN 2006 - The 15th IEEE International Symposium on Robot and Human Interactive Communication\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ROMAN.2006.314393\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ROMAN 2006 - The 15th IEEE International Symposium on Robot and Human Interactive Communication","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ROMAN.2006.314393","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Robust Speech Understanding for Multi-Modal Human-Robot Communication
To model complex human-robot interaction, researchers not only have to consider different tasks but also have to handle the complex interplay of the modules within a single robot system. In our context, we constructed a robot assistant integrated into a home or office environment. We allow for a fairly natural communication style: users communicate using speech, but they may also use gestures and contextual scene knowledge. Against this background, this paper presents a robust speech understanding component for situated human-robot communication. It serves as the interface between speech recognition and dialog management. To increase the robustness of speech processing, it rates the speech recognition output by means of semantic coherence. Even if the recognized word stream is not grammatically correct, the speech understanding component provides semantic interpretations in the context of multi-modal input for dialog management. For the understanding process, we designed special semantic concepts grounded in the domain of situated communication. These concepts also provide additional information about the dialog act. A processing mechanism uses these concept units to generate the most likely semantic interpretation of an utterance.
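The abstract does not describe the mechanism in detail, but the idea of rating recognizer output by semantic coherence can be illustrated with a minimal sketch. The concept inventory, trigger words, dialog-act labels, and coverage-based coherence score below are invented for illustration only and are not the authors' implementation; they merely show how ungrammatical n-best hypotheses could be mapped onto domain concept units and the most coherent interpretation selected.

```python
# Hypothetical sketch (not the paper's implementation): map noisy speech
# recognition hypotheses onto domain concept units and keep the hypothesis
# whose interpretation is most semantically coherent.

from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptUnit:
    """A domain concept with the words that can evoke it (assumed structure)."""
    name: str
    triggers: frozenset
    dialog_act: str  # e.g. "instruction" or "reference"; illustrative labels only


# A tiny, made-up concept inventory for a home/office robot domain.
CONCEPTS = [
    ConceptUnit("FETCH_OBJECT", frozenset({"bring", "fetch", "get"}), "instruction"),
    ConceptUnit("OBJECT_CUP", frozenset({"cup", "mug"}), "reference"),
    ConceptUnit("LOCATION_TABLE", frozenset({"table", "desk"}), "reference"),
]


def interpret(hypothesis: str):
    """Map a (possibly ungrammatical) word stream to concepts and score coherence."""
    words = hypothesis.lower().split()
    matched = [c for c in CONCEPTS if any(w in c.triggers for w in words)]
    # Toy coherence measure: fraction of words covered by some concept unit.
    covered = sum(1 for w in words if any(w in c.triggers for c in CONCEPTS))
    coherence = covered / max(len(words), 1)
    return matched, coherence


def best_interpretation(n_best):
    """Pick the recognition hypothesis whose interpretation is most coherent."""
    return max((interpret(h) + (h,) for h in n_best), key=lambda t: t[1])


if __name__ == "__main__":
    # Even an ungrammatical word stream still yields a usable interpretation.
    concepts, score, hyp = best_interpretation(
        ["bring cup the table uh", "ring cop a label"]
    )
    print(hyp, round(score, 2), [c.name for c in concepts])
```

In this toy version the dialog act attached to each matched concept unit could be passed on to dialog management together with the interpretation, mirroring the role the abstract assigns to the semantic concepts.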