ObjectFusion: Accurate object-level SLAM with neural object priors

IF 2.5; Zone 4, Computer Science; JCR Q2, COMPUTER SCIENCE, SOFTWARE ENGINEERING

Graphical Models Pub Date : 2022-09-01 DOI:10.1016/j.gmod.2022.101165

Zi-Xin Zou, Shi-Sheng Huang, Tai-Jiang Mu, Yu-Ping Wang

Graphical Models, Volume 123, Article 101165. Full text: https://www.sciencedirect.com/science/article/pii/S1524070322000418

Citations: 5

Abstract

Previous object-level Simultaneous Localization and Mapping (SLAM) approaches still fail to create high-quality object-oriented 3D maps efficiently. The main challenges are how to represent object shape effectively and how to apply that representation efficiently to accurate online camera tracking. In this paper, we present ObjectFusion, a novel object-level SLAM system for static scenes that efficiently creates object-oriented 3D maps with high-quality object reconstruction by leveraging neural object priors. We propose a neural object representation that uses only a single encoder–decoder network to express object shape across various categories, which benefits high-quality reconstruction of object instances. More importantly, we propose to convert this neural object representation into precise measurements that are used to jointly optimize the object shape, object pose, and camera pose for the final accurate 3D object reconstruction. Through extensive evaluations on synthetic and real-world RGB-D datasets, we show that ObjectFusion outperforms previous approaches, with better object reconstruction quality, a much smaller memory footprint, and higher efficiency, especially at the object level.
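
To make the phrase "jointly optimize the object shape, object pose and camera pose" concrete, the following is a generic sketch of how such an objective is commonly written for implicit (signed-distance) shape decoders. It is not taken from the paper: the decoder $f_\theta$, latent shape code $z$, object pose $T_{wo}$, camera pose $T_{wc}$, robust loss $\rho$, and weight $\lambda$ are all assumed notation introduced here for illustration only.

```latex
% Assumed notation (not from the paper):
%   p_i      : depth point observed on the object, in camera coordinates (homogeneous)
%   T_{wc}   : camera-to-world pose,  T_{wo} : object-to-world pose
%   z        : latent shape code,     f_\theta : decoder returning a signed distance
\begin{aligned}
r_i(z, T_{wo}, T_{wc}) &= f_\theta\!\big(z,\; T_{wo}^{-1}\, T_{wc}\, p_i\big), \\
E(z, T_{wo}, T_{wc})   &= \sum_{i} \rho\big(r_i(z, T_{wo}, T_{wc})\big) \;+\; \lambda\, \lVert z \rVert^{2} .
\end{aligned}
```

Under these assumptions, each observed surface point should decode to a signed distance of zero once it is moved into the object frame, so minimizing $E$ over $z$, $T_{wo}$, and $T_{wc}$ couples shape, object pose, and camera pose in a single least-squares problem. The residuals actually used by ObjectFusion may differ.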




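Source journal

Graphical Models (Engineering and Technology; Computer Science: Software Engineering)

CiteScore: 3.60

Self-citation rate: 5.90%

Articles published:

Review time: 47 days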

Journal description: Graphical Models is recognized internationally as a highly rated, top-tier journal and is focused on the creation, geometric processing, animation, and visualization of graphical models and on their applications in engineering, science, culture, and entertainment. GMOD provides its readers with thoroughly reviewed and carefully selected papers that disseminate exciting innovations, that teach rigorous theoretical foundations, that propose robust and efficient solutions, or that describe ambitious systems or applications in a variety of topics. We invite papers in five categories: research (contributions of novel theoretical or practical approaches or solutions), survey (opinionated views of the state of the art and challenges in a specific topic), system (the architecture and implementation details of an innovative architecture for a complete system that supports model/animation design, acquisition, analysis, visualization, ...), application (description of a novel application of known techniques and evaluation of its impact), or lecture (an elegant and inspiring perspective on previously published results that clarifies them and teaches them in a new way). GMOD offers its authors an accelerated review, feedback from experts in the field, immediate online publication of accepted papers, no restriction on color and length (when justified by the content) in the online version, and a broad promotion of published papers. A prestigious group of editors selected from among the premier international researchers in their fields oversees the review process.