Predictive Body Awareness in Soft Robots: A Bayesian Variational Autoencoder Fusing Multimodal Sensory Data
Authors: Shuyu Wang; Dongling Liu; Changzeng Fu; Xiaoming Yuan; Peng Shan; Victor C.M. Leung
Journal: IEEE Transactions on Robotics, vol. 41, pp. 5663-5678 (journal article, published 2025-09-16)
DOI: 10.1109/TRO.2025.3610170
URL: https://ieeexplore.ieee.org/document/11165000/
Journal metrics: impact factor 10.5; JCR Q1 (Robotics)
Predicting the causal flow by fusing multimodal perception is fundamental for constructing the bodily awareness of soft robots. However, forming such a predictive model while fusing the multimodal sensory data of soft robots remains challenging and underexplored. In this study, we leverage the free energy principle within a Bayesian probabilistic deep learning framework to merge visual, pressure, and flex sensing signals. Our proposed multimodal association mechanism enhances the fusion process, establishing a robust computational methodology. We train the model using a newly collected dataset that captures the grasping dynamics of a soft gripper equipped with multimodal perception capabilities. By incorporating the current state and image differences, the forward model can predict the soft gripper’s physical interaction and movement in the image flow, which amounts to imagining future motion events. Moreover, we showcase effective predictions across modalities as well as for grasping outcomes. Notably, our enhanced variational autoencoder approach can pave the way for unprecedented possibilities of bodily awareness in soft robotics.
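The abstract does not spell out the fusion rule or the loss, so the following is only an illustrative sketch of how a multimodal variational autoencoder is commonly set up, not the authors' multimodal association mechanism. It assumes each modality encoder (visual, pressure, flex) outputs a diagonal Gaussian over a shared latent state, fuses them with a product of experts together with a standard-normal prior, and scores the result with a variational free energy (reconstruction error plus a KL complexity term). All function names and the example numbers are hypothetical.

```python
import math

def fuse_gaussian_experts(mus, logvars):
    """Product-of-experts fusion of diagonal Gaussians (assumed fusion rule).

    mus, logvars: one list per modality, each of length latent_dim.
    A standard-normal prior expert N(0, 1) is always included.
    """
    dim = len(mus[0])
    # Prior expert: precision 1 and precision-weighted mean 0 in every dimension.
    precision_sum = [1.0] * dim
    weighted_mean_sum = [0.0] * dim
    for mu, logvar in zip(mus, logvars):
        for d in range(dim):
            precision = math.exp(-logvar[d])  # 1 / variance
            precision_sum[d] += precision
            weighted_mean_sum[d] += precision * mu[d]
    fused_mu = [w / p for w, p in zip(weighted_mean_sum, precision_sum)]
    fused_logvar = [-math.log(p) for p in precision_sum]
    return fused_mu, fused_logvar

def kl_to_standard_normal(mu, logvar):
    """KL divergence from N(mu, diag(exp(logvar))) to N(0, I)."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))

def free_energy(recon_errors, mu, logvar):
    """Variational free energy: summed per-modality reconstruction error
    (negative log-likelihood) plus the KL complexity term."""
    return sum(recon_errors) + kl_to_standard_normal(mu, logvar)

# Hypothetical encoder outputs for visual, pressure, and flex signals.
mus = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
logvars = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
fused_mu, fused_logvar = fuse_gaussian_experts(mus, logvars)
loss = free_energy([0.2, 0.1, 0.3], fused_mu, fused_logvar)
```

In this setup the fused posterior is sharper than any single expert because the precisions add, which is one reason product-of-experts fusion is a popular baseline; whether the paper uses it is not stated in the abstract.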
About the journal:
The IEEE Transactions on Robotics (T-RO) is dedicated to publishing fundamental papers covering all facets of robotics, drawing on interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, and beyond. From industrial applications to service and personal assistants, surgical operations to space, underwater, and remote exploration, robots and intelligent machines play pivotal roles across various domains, including entertainment, safety, search and rescue, military applications, agriculture, and intelligent vehicles.
Special emphasis is placed on intelligent machines and systems designed for unstructured environments, where a significant portion of the environment remains unknown and beyond direct sensing or control.