{"title":"基于RNN驱动的多模态交互平台的开发","authors":"Hung-Hsuan Huang, Masato Fukuda, T. Nishida","doi":"10.1145/3308532.3329448","DOIUrl":null,"url":null,"abstract":"This paper describes our ongoing project to build a platform that enables real-time multimodal interaction with embodied conversational agents. All of the components are in modular design and can be switched to other models easily. A prototype listener agent has been developed upon the platform. Its spontaneous reactive behaviors are trained from a multimodal data corpus collected in a human-human conversation experiment. Two Gated Recurrent Unit (GRU) based models are switched when the agent is speaking or is not speaking. These models generate the agent's facial expressions, head movements, and postures from the corresponding behaviors of the human user in real-time. Benefits from the flexible design, the utterance generation part can be an autonomous dialogue manager with hand crafted rules, an on-line chatbot engine, or a human operator.","PeriodicalId":112642,"journal":{"name":"Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Development of a Platform for RNN Driven Multimodal Interaction with Embodied Conversational Agents\",\"authors\":\"Hung-Hsuan Huang, Masato Fukuda, T. Nishida\",\"doi\":\"10.1145/3308532.3329448\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper describes our ongoing project to build a platform that enables real-time multimodal interaction with embodied conversational agents. All of the components are in modular design and can be switched to other models easily. A prototype listener agent has been developed upon the platform. Its spontaneous reactive behaviors are trained from a multimodal data corpus collected in a human-human conversation experiment. Two Gated Recurrent Unit (GRU) based models are switched when the agent is speaking or is not speaking. These models generate the agent's facial expressions, head movements, and postures from the corresponding behaviors of the human user in real-time. 
Benefits from the flexible design, the utterance generation part can be an autonomous dialogue manager with hand crafted rules, an on-line chatbot engine, or a human operator.\",\"PeriodicalId\":112642,\"journal\":{\"name\":\"Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3308532.3329448\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3308532.3329448","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
This paper describes our ongoing project to build a platform that enables real-time multimodal interaction with embodied conversational agents. All components are modular in design and can easily be swapped for alternative models. A prototype listener agent has been developed on top of the platform. Its spontaneous reactive behaviors are trained on a multimodal corpus collected in a human-human conversation experiment. Two Gated Recurrent Unit (GRU) based models are used, switched according to whether the agent is speaking or listening. These models generate the agent's facial expressions, head movements, and postures in real time from the corresponding behaviors of the human user. Thanks to this flexible design, the utterance generation component can be an autonomous dialogue manager with hand-crafted rules, an online chatbot engine, or a human operator.
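To make the behavior-generation scheme concrete, the sketch below shows how two GRU-based models might map per-frame user behavior features to agent behaviors and be switched by the agent's speaking state. This is a minimal sketch, assuming a PyTorch implementation; the feature dimensions and the `BehaviorGRU` and `ReactiveBehaviorGenerator` names are hypothetical, as the paper does not specify its architecture or framework.

```python
# Hypothetical sketch of the two-model, speaking-state-switched scheme
# described in the abstract; dimensions and names are assumptions.
import torch
import torch.nn as nn

class BehaviorGRU(nn.Module):
    """Maps a stream of user behavior features to agent behavior features."""
    def __init__(self, user_dim=30, agent_dim=20, hidden_dim=64):
        super().__init__()
        self.gru = nn.GRU(user_dim, hidden_dim, batch_first=True)
        # Output head: e.g. facial expression, head movement, posture parameters.
        self.head = nn.Linear(hidden_dim, agent_dim)

    def forward(self, user_feats, h=None):
        # user_feats: (batch, frames, user_dim) per-frame multimodal features.
        out, h = self.gru(user_feats, h)
        return self.head(out), h

class ReactiveBehaviorGenerator:
    """Holds one model per speaking state and switches between them."""
    def __init__(self):
        # One model trained on segments where the agent speaks,
        # another on segments where it listens.
        self.models = {True: BehaviorGRU(), False: BehaviorGRU()}
        self.state = None
        self.active = None

    def step(self, user_feats, agent_is_speaking):
        # Reset the recurrent state whenever the active model changes.
        if agent_is_speaking != self.active:
            self.state = None
            self.active = agent_is_speaking
        with torch.no_grad():
            behavior, self.state = self.models[agent_is_speaking](
                user_feats, self.state)
        return behavior  # (batch, frames, agent_dim) agent behavior targets

# Usage: feed one frame of user features at a time for real-time generation.
gen = ReactiveBehaviorGenerator()
frame = torch.randn(1, 1, 30)  # placeholder for one frame of user features
agent_behavior = gen.step(frame, agent_is_speaking=False)
```

Keeping a separate recurrent state per switch (and resetting it when the active model changes) is one simple way to honor the paper's two-model design; the actual platform may handle the transition differently.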