Finn Lukas Busch, Timon Homberger, Jesús Ortega-Peimbert, Quantao Yang, Olov Andersson
One Map to Find Them All: Real-time Open-Vocabulary Mapping for Zero-shot Multi-Object Navigation
The capability to efficiently search for objects in complex environments is
fundamental for many real-world robot applications. Recent advances in
open-vocabulary vision models have resulted in semantically-informed object
navigation methods that allow a robot to search for an arbitrary object without
prior training. However, these zero-shot methods have so far treated the
environment as unknown for each consecutive query. In this paper we introduce a
new benchmark for zero-shot multi-object navigation, allowing the robot to
leverage information gathered from previous searches to more efficiently find
new objects. To address this problem we build a reusable open-vocabulary
feature map tailored for real-time object search. We further propose a
probabilistic-semantic map update that mitigates common sources of errors in
semantic feature extraction and leverage this semantic uncertainty for informed
multi-object exploration. We evaluate our method on a set of object navigation
tasks both in simulation and on a real robot, running in real time on
a Jetson Orin AGX. We demonstrate that it outperforms existing state-of-the-art
approaches on both single- and multi-object navigation tasks. Additional videos,
code, and the multi-object navigation benchmark will be available at
https://finnbsch.github.io/OneMap.
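
To make the idea of a reusable feature map concrete, the following is a minimal sketch and not the authors' implementation. It assumes a 2D grid whose cells store a fused feature vector, a confidence-weighted running mean as a stand-in for the probabilistic-semantic update, and cosine similarity against a query embedding. The names FeatureMap, update, query, and cosine are hypothetical, and the two-dimensional vectors stand in for real vision-language embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class FeatureMap:
    """Toy grid of fused feature vectors (hypothetical, for illustration only)."""

    def __init__(self):
        # (x, y) -> (mean feature vector, accumulated confidence weight)
        self.cells = {}

    def update(self, cell, feature, confidence):
        """Fuse one observation into a cell using a confidence-weighted mean.

        Assumption: confidence is a positive scalar; low values let noisy
        observations change the stored feature only slightly.
        """
        if confidence <= 0.0:
            return
        mean, weight = self.cells.get(cell, ([0.0] * len(feature), 0.0))
        total = weight + confidence
        fused = [(m * weight + f * confidence) / total
                 for m, f in zip(mean, feature)]
        self.cells[cell] = (fused, total)

    def query(self, embedding):
        """Return (cell, similarity) for the best-matching cell, or None."""
        best = None
        for cell, (mean, _) in self.cells.items():
            score = cosine(mean, embedding)
            if best is None or score > best[1]:
                best = (cell, score)
        return best


# Build the map once, then answer several queries without re-exploring.
grid = FeatureMap()
grid.update((0, 0), [1.0, 0.0], 0.9)   # confident observation
grid.update((0, 0), [0.0, 1.0], 0.1)   # low-confidence, conflicting observation
grid.update((3, 1), [0.0, 1.0], 0.8)

first_goal = grid.query([1.0, 0.0])
second_goal = grid.query([0.0, 1.0])
```

Because the map stores features rather than fixed class labels, the second query reuses what was observed during the first search, which is the property the multi-object setting relies on.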