NeuralFeels with neural fields: Visuotactile perception for in-hand manipulation

IF 26.1, Zone 1 Computer Science, Q1 ROBOTICS

Science Robotics, Pub Date: 2024-11-13, DOI: 10.1126/scirobotics.adl0628

Sudharshan Suresh, Haozhi Qi, Tingfan Wu, Taosha Fan, Luis Pineda, Mike Lambeta, Jitendra Malik, Mrinal Kalakrishnan, Roberto Calandra, Michael Kaess, Joseph Ortiz, Mustafa Mukadam

Volume 9, issue 96. Article page: https://www.science.org/doi/10.1126/scirobotics.adl0628

Citations: 0

Abstract

To achieve human-level dexterity, robots must infer spatial awareness from multimodal sensing to reason over contact interactions. During in-hand manipulation of novel objects, such spatial awareness involves estimating the object’s pose and shape. The status quo for in-hand perception primarily uses vision and is restricted to tracking a priori known objects. Moreover, visual occlusion of objects in hand is imminent during manipulation, preventing current systems from pushing beyond tasks without occlusion. We combined vision and touch sensing on a multifingered hand to estimate an object’s pose and shape during in-hand manipulation. Our method, NeuralFeels, encodes object geometry by learning a neural field online and jointly tracks it by optimizing a pose graph problem. We studied multimodal in-hand perception in simulation and the real world, interacting with different objects via a proprioception-driven policy. Our experiments showed final reconstruction F scores of 81% and average pose drifts of 4.7 millimeters, which was further reduced to 2.3 millimeters with known object models. In addition, we observed that, under heavy visual occlusion, we could achieve improvements in tracking up to 94% compared with vision-only methods. Our results demonstrate that touch, at the very least, refines and, at the very best, disambiguates visual estimates during in-hand manipulation. We release our evaluation dataset of 70 experiments, FeelSight, as a step toward benchmarking in this domain. Our neural representation driven by multimodal sensing can serve as a perception backbone toward advancing robot dexterity.
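The abstract reports reconstruction quality as an F score but does not say how it is computed. As a rough illustration only, the sketch below uses a definition common in surface reconstruction: precision is the fraction of reconstructed points within a distance threshold of the ground truth, recall is the reverse, and the F score is their harmonic mean. The function name `f_score`, the brute-force nearest-neighbor search, and the threshold value are assumptions made for this example and are not taken from the paper.

```python
import math


def f_score(reconstructed, ground_truth, threshold):
    """F-score between two 3D point sets at a distance threshold.

    Hypothetical helper for illustration; points are (x, y, z) tuples and
    the threshold is in the same units as the points.
    """
    def fraction_within(source, target):
        # Fraction of source points whose nearest target point is within the threshold.
        if not source or not target:
            return 0.0
        hits = sum(
            1 for p in source
            if min(math.dist(p, q) for q in target) <= threshold
        )
        return hits / len(source)

    precision = fraction_within(reconstructed, ground_truth)
    recall = fraction_within(ground_truth, reconstructed)
    if precision + recall == 0.0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2.0 * precision * recall / (precision + recall)


# Toy data in meters (made up): three reconstructed points lie about 1 mm
# from the ground truth, one is far away, and one ground-truth point is missed.
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
reconstructed = [(0.0, 0.0, 0.001), (1.0, 0.0, 0.001), (2.0, 0.0, 0.001), (10.0, 0.0, 0.0)]
score = f_score(reconstructed, ground_truth, threshold=0.005)
print(f"F-score: {score:.2f}")
```

On the toy data, precision and recall are both 3/4, so the score is 0.75. A practical evaluation would sample dense points from the meshes and use a spatial index such as a k-d tree rather than the quadratic search shown here.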




Source journal

Science Robotics Mathematics-Control and Optimization

CiteScore

30.60

Self-citation rate

2.80%

Publication volume

Journal description: Science Robotics publishes original, peer-reviewed, science- or engineering-based research articles that advance the field of robotics. The journal also features editor-commissioned Reviews. An international team of academic editors holds Science Robotics articles to the same high-quality standard that is the hallmark of the Science family of journals. Sub-topics include: actuators, advanced materials, artificial intelligence, autonomous vehicles, bio-inspired design, exoskeletons, fabrication, field robotics, human-robot interaction, humanoids, industrial robotics, kinematics, machine learning, material science, medical technology, motion planning and control, micro- and nano-robotics, multi-robot control, sensors, service robotics, social and ethical issues, soft robotics, and space, planetary and undersea exploration.