{"title":"On the Multisensory Nature of Objects and Language: A Robotics Perspective","authors":"J. Sinapov","doi":"10.1145/3347450.3357658","DOIUrl":null,"url":null,"abstract":"Infants use exploratory behaviors to learn about the objects around them. Psychologists have theorized that behaviors such as grasping touching, pressing, and lifting, coupled with the visual, tactile, haptic and auditory sensory modalities, enable infants to form grounded object representations. For example, scratching an object can provide information about its roughness, while lifting it can provide information about its weight. In a sense, the exploratory behavior acts as a \"question'' to the object, which is subsequently \"answered\" by the sensory stimuli produced during the execution of the behavior. In contrast, most object representations used by robots today rely solely on computer vision or laser scan data, gathered through passive observation. Such disembodied approaches to robotic perception may be useful for recognizing an object using a 3D model database, but nevertheless, will fail to infer object properties that cannot be detected using vision alone. To bridge this gap, our research has pursued a developmental framework for object perception and exploration in which the robot's representation of objects is grounded in its own sensorimotor experience with them \\citesinapov2014grounding. In this framework, an object is represented by sensorimotor contingencies that span a diverse set of exploratory behaviors and sensory modalities. In this talk, I will highlight results from several large-scale experimental studies which show that the behavior-grounded object representation enables a robot to solve a wide variety of perceptual and cognitive tasks relevant to object learning \\citesinapov2014learning,sinapov2011interactive. I will discuss recent work on how robots can ground language in multisensory experience with objects \\citethomason2016learning and will conclude with a discussion on open problems in multisensory symbol grounding, which, if solved, could result in the large-scale deployment of robotic systems in real-world domains.","PeriodicalId":329495,"journal":{"name":"1st International Workshop on Multimodal Understanding and Learning for Embodied Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"1st International Workshop on Multimodal Understanding and Learning for Embodied Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3347450.3357658","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Infants use exploratory behaviors to learn about the objects around them. Psychologists have theorized that behaviors such as grasping, touching, pressing, and lifting, coupled with the visual, tactile, haptic, and auditory sensory modalities, enable infants to form grounded object representations. For example, scratching an object can provide information about its roughness, while lifting it can provide information about its weight. In a sense, the exploratory behavior acts as a "question" to the object, which is subsequently "answered" by the sensory stimuli produced during the execution of the behavior. In contrast, most object representations used by robots today rely solely on computer vision or laser scan data gathered through passive observation. Such disembodied approaches to robotic perception may be useful for recognizing an object against a 3D model database, but they fail to infer object properties that cannot be detected through vision alone. To bridge this gap, our research has pursued a developmental framework for object perception and exploration in which the robot's representation of objects is grounded in its own sensorimotor experience with them \cite{sinapov2014grounding}. In this framework, an object is represented by sensorimotor contingencies that span a diverse set of exploratory behaviors and sensory modalities. In this talk, I will highlight results from several large-scale experimental studies showing that the behavior-grounded object representation enables a robot to solve a wide variety of perceptual and cognitive tasks relevant to object learning \cite{sinapov2014learning,sinapov2011interactive}. I will discuss recent work on how robots can ground language in multisensory experience with objects \cite{thomason2016learning} and will conclude with a discussion of open problems in multisensory symbol grounding which, if solved, could enable the large-scale deployment of robotic systems in real-world domains.
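As a rough illustration (not code from the cited work), the behavior-grounded representation can be thought of as a mapping from (behavior, modality) pairs to the sensory features recorded while that behavior is executed on the object. The following is a minimal sketch under that reading; the class and function names, and the nearest-neighbor recognition scheme, are illustrative assumptions rather than the authors' implementation:

```python
from collections import defaultdict
import numpy as np

class GroundedObjectModel:
    """Behavior-grounded object representation (illustrative sketch).

    An object is described not by a single feature vector but by the
    sensory outcomes observed under each (behavior, modality) context,
    e.g. ("lift", "haptic") or ("scratch", "audio").
    """

    def __init__(self, name):
        self.name = name
        # (behavior, modality) -> list of feature vectors, one per trial
        self.observations = defaultdict(list)

    def record(self, behavior, modality, features):
        """Store the sensory features produced by executing `behavior`."""
        self.observations[(behavior, modality)].append(np.asarray(features))

    def prototype(self, behavior, modality):
        """Mean feature vector for one sensorimotor context, or None."""
        trials = self.observations[(behavior, modality)]
        return np.mean(trials, axis=0) if trials else None


def recognize(query_model, known_models, contexts):
    """Hypothetical nearest-neighbor recognition: compare prototypes
    across the behavior/modality contexts shared by both objects."""
    def distance(known):
        total, count = 0.0, 0
        for ctx in contexts:
            pq, pk = query_model.prototype(*ctx), known.prototype(*ctx)
            if pq is not None and pk is not None:
                total += np.linalg.norm(pq - pk)
                count += 1
        return total / count if count else float("inf")

    return min(known_models, key=distance)
```

Under this sketch, a robot that has lifted and scratched a set of training objects could identify a novel object by executing the same behaviors on it and comparing the resulting haptic and auditory prototypes against those of the known objects.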