FROMFusion: Fast and Robust On-Manifold Dense Reconstruction for Low-Cost Wheeled Robots

IF 5.6 · Zone 2, Engineering & Technology · JCR Q1, Engineering, Electrical & Electronic

IEEE Transactions on Instrumentation and Measurement Pub Date : 2024-10-16 DOI:10.1109/TIM.2024.3481537

Minjie Bao;Junguo Fan;Zhendong Fan;Runze Xu;Ke Wang;Chaonan Mu;Ruifeng Li;Hewen Zhou;Peng Kang

Volume 73, pp. 1-18 · Journal Article · https://ieeexplore.ieee.org/document/10720077/

Citations: 0

Abstract

Existing voxel-hashing (VH)-based dense reconstruction methods have shown impressive results on datasets collected by hand-held cameras, maintaining large-scale scenes in a truncated signed distance function (TSDF) volumetric representation. However, deploying such methods on low-cost embedded mobile robots remains challenging because of heavy computational burdens and various cases of camera perception degeneration. In this work, we propose FROMFusion, a fast and robust on-manifold dense reconstruction framework based on multisensor fusion, which systematically addresses how to align the point cloud with the hashed TSDF volume (HTV). Its purely geometric nature ensures robustness to image motion blur and poor lighting conditions. To reduce memory overhead, we propose a spherical-coordinate-based HTV segmentation algorithm. To cope with missing geometric features, camera occlusion, and over-range measurements, a loosely coupled LiDAR-wheel-inertial odometry (LWIO) provides trustworthy initial guesses for camera pose optimization. At the core of the framework is a two-stage depth-to-HTV matching algorithm, which combines a coarse voxel-level pose ergodic search with a fine subvoxel-level Gauss-Newton (GN) solver that uses an Anderson acceleration (AA) strategy for faster convergence. We distribute heavy computational workloads evenly across heterogeneous computing systems. Extensive field experiments on a low-cost wheeled robot cleaner demonstrate that our method models continuous surfaces of large-scale scenes with high quality in both geometry and texture, outperforming current state-of-the-art methods by a significant margin in robustness to camera perception degeneration. The online embedded implementation reaches a frame rate of up to 47.21 Hz.
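The abstract names Anderson acceleration (AA) as the device that speeds up the fine Gauss-Newton stage but gives no solver details. As a rough illustration of the general idea only, the sketch below applies textbook Anderson acceleration to a generic fixed-point map x = g(x). It is not the paper's on-manifold pose solver: the function name `anderson_fixed_point`, the history depth `m`, the tolerance, and the linear toy map are all assumptions made for this example.

```python
import numpy as np

def anderson_fixed_point(g, x0, m=5, tol=1e-10, max_iter=100):
    """Generic Anderson acceleration for x = g(x) (illustrative sketch only).

    Returns the final iterate and the number of iterations performed.
    """
    x = np.asarray(x0, dtype=float)
    g_hist, f_hist = [], []              # recent g(x_k) and residuals f_k
    for k in range(max_iter):
        gx = g(x)
        f = gx - x                       # fixed-point residual
        if np.linalg.norm(f) < tol:
            return x, k
        g_hist.append(gx)
        f_hist.append(f)
        g_hist, f_hist = g_hist[-(m + 1):], f_hist[-(m + 1):]
        if len(f_hist) == 1:
            x = gx                       # first step: plain fixed-point update
            continue
        # Columns are differences of consecutive residuals and g-values.
        dF = np.stack([f_hist[i + 1] - f_hist[i]
                       for i in range(len(f_hist) - 1)], axis=1)
        dG = np.stack([g_hist[i + 1] - g_hist[i]
                       for i in range(len(g_hist) - 1)], axis=1)
        # Least-squares mixing weights: minimize ||f - dF @ gamma||.
        gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
        x = gx - dG @ gamma              # accelerated update
    return x, max_iter

# Toy usage (assumed example): a slowly contracting linear map g(v) = A v + b.
A = np.diag([0.9, 0.5, 0.1])
b = np.ones(3)
x_star, n_iter = anderson_fixed_point(lambda v: A @ v + b, np.zeros(3))
print(x_star, n_iter)
```

In a Gauss-Newton setting, g would correspond to one solver update of the pose estimate; how the paper combines AA with the on-manifold update is not stated in the abstract, so that mapping is left open here.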


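Source journal

IEEE Transactions on Instrumentation and Measurement (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 9.00

Self-citation rate: 23.20%

Articles published: 1294

Review time: 3.9 months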

Journal description: Papers are sought that address innovative solutions to the development and use of electrical and electronic instruments and equipment to measure, monitor and/or record physical phenomena for the purpose of advancing measurement science, methods, functionality and applications. The scope of these papers may encompass: (1) theory, methodology, and practice of measurement; (2) design, development and evaluation of instrumentation and measurement systems and components used in generating, acquiring, conditioning and processing signals; (3) analysis, representation, display, and preservation of the information obtained from a set of measurements; and (4) scientific and technical support to establishment and maintenance of technical standards in the field of Instrumentation and Measurement.