{"title":"从亚符号多模态感知中学习语义成分","authors":"Olivier Mangin, Pierre-Yves Oudeyer","doi":"10.1109/DEVLRN.2013.6652563","DOIUrl":null,"url":null,"abstract":"Perceptual systems often include sensors from several modalities. However, existing robots do not yet sufficiently discover patterns that are spread over the flow of multimodal data they receive. In this paper we present a framework that learns a dictionary of words from full spoken utterances, together with a set of gestures from human demonstrations and the semantic connection between words and gestures. We explain how to use a nonnegative matrix factorization algorithm to learn a dictionary of components that represent meaningful elements present in the multimodal perception, without providing the system with a symbolic representation of the semantics. We illustrate this framework by showing how a learner discovers word-like components from observation of gestures made by a human together with spoken descriptions of the gestures, and how it captures the semantic association between the two.","PeriodicalId":106997,"journal":{"name":"2013 IEEE Third Joint International Conference on Development and Learning and Epigenetic Robotics (ICDL)","volume":"19 12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"29","resultStr":"{\"title\":\"Learning semantic components from subsymbolic multimodal perception\",\"authors\":\"Olivier Mangin, Pierre-Yves Oudeyer\",\"doi\":\"10.1109/DEVLRN.2013.6652563\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Perceptual systems often include sensors from several modalities. However, existing robots do not yet sufficiently discover patterns that are spread over the flow of multimodal data they receive. In this paper we present a framework that learns a dictionary of words from full spoken utterances, together with a set of gestures from human demonstrations and the semantic connection between words and gestures. We explain how to use a nonnegative matrix factorization algorithm to learn a dictionary of components that represent meaningful elements present in the multimodal perception, without providing the system with a symbolic representation of the semantics. 
We illustrate this framework by showing how a learner discovers word-like components from observation of gestures made by a human together with spoken descriptions of the gestures, and how it captures the semantic association between the two.\",\"PeriodicalId\":106997,\"journal\":{\"name\":\"2013 IEEE Third Joint International Conference on Development and Learning and Epigenetic Robotics (ICDL)\",\"volume\":\"19 12 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-08-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"29\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 IEEE Third Joint International Conference on Development and Learning and Epigenetic Robotics (ICDL)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DEVLRN.2013.6652563\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE Third Joint International Conference on Development and Learning and Epigenetic Robotics (ICDL)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DEVLRN.2013.6652563","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Learning semantic components from subsymbolic multimodal perception
Perceptual systems often include sensors from several modalities. However, existing robots do not yet sufficiently discover the patterns spread across the streams of multimodal data they receive. In this paper we present a framework that learns a dictionary of words from full spoken utterances, a set of gestures from human demonstrations, and the semantic connection between words and gestures. We explain how a nonnegative matrix factorization algorithm can learn a dictionary of components that represent meaningful elements of the multimodal perception, without providing the system with a symbolic representation of the semantics. We illustrate this framework by showing how a learner discovers word-like components by observing gestures made by a human together with spoken descriptions of those gestures, and how it captures the semantic association between the two.
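The underlying mechanism can be pictured as concatenating the feature representations of both modalities into a single nonnegative data matrix and factorizing it, so that each learned dictionary atom spans speech and gesture features jointly; cross-modal associations then emerge from the shared activation coefficients. Below is a minimal sketch of that idea on synthetic data. It uses scikit-learn's NMF rather than the authors' own algorithm, and all shapes, variable names, and the data-generation process are illustrative assumptions, not details from the paper.

```python
# Minimal sketch: multimodal dictionary learning via NMF on concatenated
# speech + gesture features (synthetic, hypothetical data).
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
n_samples, n_speech, n_gesture, n_concepts = 200, 30, 20, 5

# Hypothetical ground-truth dictionary: each latent "concept" has both a
# speech signature (e.g. an acoustic histogram for a word) and a gesture
# signature (e.g. a motion histogram), concatenated into one atom.
D_speech = rng.random((n_concepts, n_speech))
D_gesture = rng.random((n_concepts, n_gesture))
D_true = np.hstack([D_speech, D_gesture])   # shape: (n_concepts, n_speech + n_gesture)

# Each observation mixes a few concepts; both modalities share the same
# mixing coefficients, which is what ties words to gestures in the data.
coeffs = rng.random((n_samples, n_concepts)) * (rng.random((n_samples, n_concepts)) < 0.4)
V = coeffs @ D_true + 0.01 * rng.random((n_samples, n_speech + n_gesture))

# Learn a dictionary of multimodal components from the concatenated data.
model = NMF(n_components=n_concepts, init="nndsvda", max_iter=1000, random_state=0)
W = model.fit_transform(V)   # per-sample activations of each component
H = model.components_        # learned multimodal atoms

# Each learned atom spans both modalities: its speech part and its gesture
# part are activated together, encoding a word-gesture association without
# any symbolic labels.
for k, atom in enumerate(H):
    speech_part, gesture_part = atom[:n_speech], atom[n_speech:]
    print(f"atom {k}: speech mass {speech_part.sum():.2f}, "
          f"gesture mass {gesture_part.sum():.2f}")
```

Because each atom couples a speech part with a gesture part, observing only one modality at test time suffices to estimate the activations and thereby predict the expected pattern in the other modality; that shared-activation structure is one concrete way the semantic association between words and gestures can be captured without symbolic supervision.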