{"title":"On the Multisensory Nature of Objects and Language: A Robotics Perspective","authors":"J. Sinapov","doi":"10.1145/3347450.3357658","DOIUrl":null,"url":null,"abstract":"Infants use exploratory behaviors to learn about the objects around them. Psychologists have theorized that behaviors such as grasping touching, pressing, and lifting, coupled with the visual, tactile, haptic and auditory sensory modalities, enable infants to form grounded object representations. For example, scratching an object can provide information about its roughness, while lifting it can provide information about its weight. In a sense, the exploratory behavior acts as a \"question'' to the object, which is subsequently \"answered\" by the sensory stimuli produced during the execution of the behavior. In contrast, most object representations used by robots today rely solely on computer vision or laser scan data, gathered through passive observation. Such disembodied approaches to robotic perception may be useful for recognizing an object using a 3D model database, but nevertheless, will fail to infer object properties that cannot be detected using vision alone. To bridge this gap, our research has pursued a developmental framework for object perception and exploration in which the robot's representation of objects is grounded in its own sensorimotor experience with them \\citesinapov2014grounding. In this framework, an object is represented by sensorimotor contingencies that span a diverse set of exploratory behaviors and sensory modalities. In this talk, I will highlight results from several large-scale experimental studies which show that the behavior-grounded object representation enables a robot to solve a wide variety of perceptual and cognitive tasks relevant to object learning \\citesinapov2014learning,sinapov2011interactive. I will discuss recent work on how robots can ground language in multisensory experience with objects \\citethomason2016learning and will conclude with a discussion on open problems in multisensory symbol grounding, which, if solved, could result in the large-scale deployment of robotic systems in real-world domains.","PeriodicalId":329495,"journal":{"name":"1st International Workshop on Multimodal Understanding and Learning for Embodied Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"1st International Workshop on Multimodal Understanding and Learning for Embodied Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3347450.3357658","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Infants use exploratory behaviors to learn about the objects around them. Psychologists have theorized that behaviors such as grasping, touching, pressing, and lifting, coupled with the visual, tactile, haptic, and auditory sensory modalities, enable infants to form grounded object representations. For example, scratching an object can provide information about its roughness, while lifting it can provide information about its weight. In a sense, the exploratory behavior acts as a "question" to the object, which is subsequently "answered" by the sensory stimuli produced during the execution of the behavior. In contrast, most object representations used by robots today rely solely on computer vision or laser scan data gathered through passive observation. Such disembodied approaches to robotic perception may be useful for recognizing an object against a 3D model database, but they fail to infer object properties that cannot be detected through vision alone. To bridge this gap, our research has pursued a developmental framework for object perception and exploration in which the robot's representation of objects is grounded in its own sensorimotor experience with them \cite{sinapov2014grounding}. In this framework, an object is represented by sensorimotor contingencies that span a diverse set of exploratory behaviors and sensory modalities. In this talk, I will highlight results from several large-scale experimental studies showing that the behavior-grounded object representation enables a robot to solve a wide variety of perceptual and cognitive tasks relevant to object learning \cite{sinapov2014learning,sinapov2011interactive}. I will discuss recent work on how robots can ground language in multisensory experience with objects \cite{thomason2016learning} and will conclude with a discussion of open problems in multisensory symbol grounding which, if solved, could enable the large-scale deployment of robotic systems in real-world domains.
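As a rough illustration (not code from the cited work), the behavior-grounded representation can be thought of as a mapping from (behavior, modality) pairs to the sensory features recorded while that behavior is executed on the object. The following is a minimal sketch under that reading; the class and function names, and the nearest-neighbor recognition scheme, are illustrative assumptions rather than the authors' implementation:

```python
from collections import defaultdict
import numpy as np

class GroundedObjectModel:
    """Behavior-grounded object representation (illustrative sketch).

    An object is described not by a single feature vector but by the
    sensory outcomes observed under each (behavior, modality) context,
    e.g. ("lift", "haptic") or ("scratch", "audio").
    """

    def __init__(self, name):
        self.name = name
        # (behavior, modality) -> list of feature vectors, one per trial
        self.observations = defaultdict(list)

    def record(self, behavior, modality, features):
        """Store the sensory features produced by executing `behavior`."""
        self.observations[(behavior, modality)].append(np.asarray(features))

    def prototype(self, behavior, modality):
        """Mean feature vector for one sensorimotor context, or None."""
        trials = self.observations[(behavior, modality)]
        return np.mean(trials, axis=0) if trials else None


def recognize(query_model, known_models, contexts):
    """Hypothetical nearest-neighbor recognition: compare prototypes
    across the behavior/modality contexts shared by both objects."""
    def distance(known):
        total, count = 0.0, 0
        for ctx in contexts:
            pq, pk = query_model.prototype(*ctx), known.prototype(*ctx)
            if pq is not None and pk is not None:
                total += np.linalg.norm(pq - pk)
                count += 1
        return total / count if count else float("inf")

    return min(known_models, key=distance)
```

Under this sketch, a robot that has lifted and scratched a set of training objects could identify a novel object by executing the same behaviors on it and comparing the resulting haptic and auditory prototypes against those of the known objects.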