Toon Van de Maele;Tim Verbelen;Pietro Mazzaglia;Stefano Ferraro;Bart Dhoedt
{"title":"使用主动推理进行以对象为中心的场景表示","authors":"Toon Van de Maele;Tim Verbelen;Pietro Mazzaglia;Stefano Ferraro;Bart Dhoedt","doi":"10.1162/neco_a_01637","DOIUrl":null,"url":null,"abstract":"Representing a scene and its constituent objects from raw sensory data is a core ability for enabling robots to interact with their environment. In this letter, we propose a novel approach for scene understanding, leveraging an object-centric generative model that enables an agent to infer object category and pose in an allocentric reference frame using active inference, a neuro-inspired framework for action and perception. For evaluating the behavior of an active vision agent, we also propose a new benchmark where, given a target viewpoint of a particular object, the agent needs to find the best matching viewpoint given a workspace with randomly positioned objects in 3D. We demonstrate that our active inference agent is able to balance epistemic foraging and goal-driven behavior, and quantitatively outperforms both supervised and reinforcement learning baselines by more than a factor of two in terms of success rate.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 4","pages":"677-704"},"PeriodicalIF":2.7000,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Object-Centric Scene Representations Using Active Inference\",\"authors\":\"Toon Van de Maele;Tim Verbelen;Pietro Mazzaglia;Stefano Ferraro;Bart Dhoedt\",\"doi\":\"10.1162/neco_a_01637\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Representing a scene and its constituent objects from raw sensory data is a core ability for enabling robots to interact with their environment. In this letter, we propose a novel approach for scene understanding, leveraging an object-centric generative model that enables an agent to infer object category and pose in an allocentric reference frame using active inference, a neuro-inspired framework for action and perception. For evaluating the behavior of an active vision agent, we also propose a new benchmark where, given a target viewpoint of a particular object, the agent needs to find the best matching viewpoint given a workspace with randomly positioned objects in 3D. We demonstrate that our active inference agent is able to balance epistemic foraging and goal-driven behavior, and quantitatively outperforms both supervised and reinforcement learning baselines by more than a factor of two in terms of success rate.\",\"PeriodicalId\":54731,\"journal\":{\"name\":\"Neural Computation\",\"volume\":\"36 4\",\"pages\":\"677-704\"},\"PeriodicalIF\":2.7000,\"publicationDate\":\"2024-03-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Computation\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10535076/\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Computation","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10535076/","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Object-Centric Scene Representations Using Active Inference
Representing a scene and its constituent objects from raw sensory data is a core ability for enabling robots to interact with their environment. In this letter, we propose a novel approach for scene understanding, leveraging an object-centric generative model that enables an agent to infer object category and pose in an allocentric reference frame using active inference, a neuro-inspired framework for action and perception. For evaluating the behavior of an active vision agent, we also propose a new benchmark where, given a target viewpoint of a particular object, the agent needs to find the best matching viewpoint given a workspace with randomly positioned objects in 3D. We demonstrate that our active inference agent is able to balance epistemic foraging and goal-driven behavior, and quantitatively outperforms both supervised and reinforcement learning baselines by more than a factor of two in terms of success rate.
期刊介绍:
Neural Computation is uniquely positioned at the crossroads between neuroscience and TMCS and welcomes the submission of original papers from all areas of TMCS, including: Advanced experimental design; Analysis of chemical sensor data; Connectomic reconstructions; Analysis of multielectrode and optical recordings; Genetic data for cell identity; Analysis of behavioral data; Multiscale models; Analysis of molecular mechanisms; Neuroinformatics; Analysis of brain imaging data; Neuromorphic engineering; Principles of neural coding, computation, circuit dynamics, and plasticity; Theories of brain function.