NeuralFeels with neural fields: Visuotactile perception for in-hand manipulation

Impact factor 26.1 · CAS Tier 1, Computer Science · JCR Q1, Robotics
Sudharshan Suresh, Haozhi Qi, Tingfan Wu, Taosha Fan, Luis Pineda, Mike Lambeta, Jitendra Malik, Mrinal Kalakrishnan, Roberto Calandra, Michael Kaess, Joseph Ortiz, Mustafa Mukadam
{"title":"NeuralFeels with neural fields: Visuotactile perception for in-hand manipulation","authors":"Sudharshan Suresh,&nbsp;Haozhi Qi,&nbsp;Tingfan Wu,&nbsp;Taosha Fan,&nbsp;Luis Pineda,&nbsp;Mike Lambeta,&nbsp;Jitendra Malik,&nbsp;Mrinal Kalakrishnan,&nbsp;Roberto Calandra,&nbsp;Michael Kaess,&nbsp;Joseph Ortiz,&nbsp;Mustafa Mukadam","doi":"10.1126/scirobotics.adl0628","DOIUrl":null,"url":null,"abstract":"<div >To achieve human-level dexterity, robots must infer spatial awareness from multimodal sensing to reason over contact interactions. During in-hand manipulation of novel objects, such spatial awareness involves estimating the object’s pose and shape. The status quo for in-hand perception primarily uses vision and is restricted to tracking a priori known objects. Moreover, visual occlusion of objects in hand is imminent during manipulation, preventing current systems from pushing beyond tasks without occlusion. We combined vision and touch sensing on a multifingered hand to estimate an object’s pose and shape during in-hand manipulation. Our method, NeuralFeels, encodes object geometry by learning a neural field online and jointly tracks it by optimizing a pose graph problem. We studied multimodal in-hand perception in simulation and the real world, interacting with different objects via a proprioception-driven policy. Our experiments showed final reconstruction <i>F</i> scores of 81% and average pose drifts of 4.7 millimeters, which was further reduced to 2.3 millimeters with known object models. In addition, we observed that, under heavy visual occlusion, we could achieve improvements in tracking up to 94% compared with vision-only methods. Our results demonstrate that touch, at the very least, refines and, at the very best, disambiguates visual estimates during in-hand manipulation. We release our evaluation dataset of 70 experiments, FeelSight, as a step toward benchmarking in this domain. Our neural representation driven by multimodal sensing can serve as a perception backbone toward advancing robot dexterity.</div>","PeriodicalId":56029,"journal":{"name":"Science Robotics","volume":"9 96","pages":""},"PeriodicalIF":26.1000,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.science.org/doi/reader/10.1126/scirobotics.adl0628","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Science Robotics","FirstCategoryId":"94","ListUrlMain":"https://www.science.org/doi/10.1126/scirobotics.adl0628","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ROBOTICS","Score":null,"Total":0}
Citations: 0

Abstract

To achieve human-level dexterity, robots must infer spatial awareness from multimodal sensing to reason over contact interactions. During in-hand manipulation of novel objects, such spatial awareness involves estimating the object’s pose and shape. The status quo for in-hand perception primarily uses vision and is restricted to tracking a priori known objects. Moreover, visual occlusion of objects in hand is imminent during manipulation, preventing current systems from pushing beyond tasks without occlusion. We combined vision and touch sensing on a multifingered hand to estimate an object’s pose and shape during in-hand manipulation. Our method, NeuralFeels, encodes object geometry by learning a neural field online and jointly tracks it by optimizing a pose graph problem. We studied multimodal in-hand perception in simulation and the real world, interacting with different objects via a proprioception-driven policy. Our experiments showed final reconstruction F scores of 81% and average pose drifts of 4.7 millimeters, which was further reduced to 2.3 millimeters with known object models. In addition, we observed that, under heavy visual occlusion, we could achieve improvements in tracking up to 94% compared with vision-only methods. Our results demonstrate that touch, at the very least, refines and, at the very best, disambiguates visual estimates during in-hand manipulation. We release our evaluation dataset of 70 experiments, FeelSight, as a step toward benchmarking in this domain. Our neural representation driven by multimodal sensing can serve as a perception backbone toward advancing robot dexterity.
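The method's core loop, learning a neural signed-distance field (SDF) of the object online while jointly optimizing the object's pose against that same field, can be sketched in a few lines. The snippet below is a minimal illustration under assumptions, not the authors' implementation: the `NeuralSDF` network, the `transform` helper, the loss, and all hyperparameters are hypothetical stand-ins, and the real system additionally relies on vision and tactile depth frontends, free-space and regularization terms, and a pose graph over keyframes.

```python
# Minimal sketch (not the NeuralFeels implementation): fit a small neural SDF
# from surface samples while jointly optimizing an incremental object pose.
# All names, losses, and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

class NeuralSDF(nn.Module):
    """MLP mapping a 3D point in the object frame to a signed distance."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, pts):
        return self.net(pts).squeeze(-1)

def transform(pts, pose):
    """Apply a small SE(3) increment: 3 rotation params (first-order) + 3 translation."""
    omega, t = pose[:3], pose[3:]
    # First-order rotation R ~ I + [omega]_x, adequate for small incremental updates.
    zero = torch.zeros(())
    skew = torch.stack([
        torch.stack([zero, -omega[2], omega[1]]),
        torch.stack([omega[2], zero, -omega[0]]),
        torch.stack([-omega[1], omega[0], zero]),
    ])
    R = torch.eye(3) + skew
    return pts @ R.T + t

# Placeholder surface samples in the sensor frame (standing in for fused
# visual and tactile depth measurements).
surface_pts = torch.randn(256, 3) * 0.03
field = NeuralSDF()
pose = torch.zeros(6, requires_grad=True)   # incremental object pose
opt = torch.optim.Adam([
    {"params": field.parameters(), "lr": 1e-3},
    {"params": [pose], "lr": 1e-2},
])

for step in range(200):
    opt.zero_grad()
    pts_obj = transform(surface_pts, pose)   # map measurements into the object frame
    sdf = field(pts_obj)
    loss = (sdf ** 2).mean()                 # observed surface points should lie on the zero level set
    loss.backward()
    opt.step()
```

In the full system, the zero-level-set loss above would be complemented by free-space supervision so the field cannot collapse to zero everywhere, and the per-frame pose variables would be tied together in a pose graph solved by a dedicated optimizer rather than by plain gradient descent.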


Source journal

Science Robotics (Mathematics-Control and Optimization)

CiteScore: 30.60
Self-citation rate: 2.80%
Articles per year: 83

Journal description: Science Robotics publishes original, peer-reviewed, science- or engineering-based research articles that advance the field of robotics. The journal also features editor-commissioned Reviews. An international team of academic editors holds Science Robotics articles to the same high-quality standard that is the hallmark of the Science family of journals. Subtopics include actuators, advanced materials, artificial intelligence, autonomous vehicles, bio-inspired design, exoskeletons, fabrication, field robotics, human-robot interaction, humanoids, industrial robotics, kinematics, machine learning, material science, medical technology, motion planning and control, micro- and nano-robotics, multi-robot control, sensors, service robotics, social and ethical issues, soft robotics, and space, planetary, and undersea exploration.