Latest Articles in IEEE Robotics and Automation Letters

DVRP-MHSI: Dynamic Visualization Research Platform for Multimodal Human-Swarm Interaction
IF 4.6, CAS Zone 2, Computer Science
IEEE Robotics and Automation Letters Pub Date : 2025-04-14 DOI: 10.1109/LRA.2025.3560825
Pengming Zhu;Zhiwen Zeng;Weijia Yao;Wei Dai;Huimin Lu;Zongtan Zhou
{"title":"DVRP-MHSI: Dynamic Visualization Research Platform for Multimodal Human-Swarm Interaction","authors":"Pengming Zhu;Zhiwen Zeng;Weijia Yao;Wei Dai;Huimin Lu;Zongtan Zhou","doi":"10.1109/LRA.2025.3560825","DOIUrl":"https://doi.org/10.1109/LRA.2025.3560825","url":null,"abstract":"In recent years, there has been a significant amount of research on algorithms and control methods for distributed collaborative robots. However, the emergence of collective behavior in a swarm is still difficult to predict and control. Nevertheless, human interaction with the swarm helps render the swarm more predictable and controllable, as human operators can utilize intuition or knowledge that is not always available to the swarm. Therefore, this letter designs the Dynamic Visualization Research Platform for Multimodal Human-Swarm Interaction (DVRP-MHSI), which is an innovative open system that can perform real-time dynamic visualization and is specifically designed to accommodate a multitude of interaction modalities (such as brain-computer, eye-tracking, electromyographic, and touch-based interfaces), thereby expediting progress in human-swarm interaction research. Specifically, the platform consists of custom-made low-cost omnidirectional wheeled mobile robots, multitouch screens and two workstations. In particular, the mutitouch screens can recognize human gestures and the shapes of objects placed on them, and they can also dynamically render diverse scenes. One of the workstations processes communication information within robots and the other one implements human-robot interaction methods. The development of DVRP-MHSI frees researchers from hardware or software details and allows them to focus on versatile swarm algorithms and human-swarm interaction methods without being limited to predefined and static scenarios, tasks, and interfaces. The effectiveness and potential of the platform for human-swarm interaction studies are validated by several demonstrative experiments.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 6","pages":"5665-5672"},"PeriodicalIF":4.6,"publicationDate":"2025-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10964715","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143883301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
HAS-RRT: RRT-Based Motion Planning Using Topological Guidance
IF 4.6, CAS Zone 2, Computer Science
IEEE Robotics and Automation Letters Pub Date : 2025-04-14 DOI: 10.1109/LRA.2025.3560878
Diane Uwacu;Ananya Yammanuru;Keerthana Nallamotu;Vasu Chalasani;Marco Morales;Nancy M. Amato
{"title":"HAS-RRT: RRT-Based Motion Planning Using Topological Guidance","authors":"Diane Uwacu;Ananya Yammanuru;Keerthana Nallamotu;Vasu Chalasani;Marco Morales;Nancy M. Amato","doi":"10.1109/LRA.2025.3560878","DOIUrl":"https://doi.org/10.1109/LRA.2025.3560878","url":null,"abstract":"We present a hierarchical RRT-based motion planning strategy, Hierarchical Annotated-Skeleton Guided RRT (HAS-RRT), guided by a workspace skeleton, to solve motion planning problems. HAS-RRT provides up to a 91% runtime reduction and builds a tree at least 30% smaller than competitors while still finding competitive-cost paths. This is because our strategy prioritizes paths indicated by the workspace guidance to efficiently find a valid motion plan for the robot. Existing methods either rely too heavily on workspace guidance or have difficulty finding narrow passages. By taking advantage of the assumptions that the workspace skeleton provides, HAS-RRT is able to build a smaller tree and find a path faster than its competitors. Additionally, we show that HAS-RRT is robust to the quality of workspace guidance provided and that, in a worst-case scenario where the workspace skeleton provides no additional insight, our method performs comparably to an unguided method.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 6","pages":"6223-6230"},"PeriodicalIF":4.6,"publicationDate":"2025-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10964851","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143938006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
REALM: Real-Time Estimates of Assistance for Learned Models in Human-Robot Interaction
IF 4.6, CAS Zone 2, Computer Science
IEEE Robotics and Automation Letters Pub Date : 2025-04-14 DOI: 10.1109/LRA.2025.3560862
Michael Hagenow;Julie A. Shah
{"title":"REALM: Real-Time Estimates of Assistance for Learned Models in Human-Robot Interaction","authors":"Michael Hagenow;Julie A. Shah","doi":"10.1109/LRA.2025.3560862","DOIUrl":"https://doi.org/10.1109/LRA.2025.3560862","url":null,"abstract":"There are a variety of mechanisms (i.e., input types) for real-time human interaction that can facilitate effective human-robot teaming. For example, previous works have shown how teleoperation, corrective, and discrete (i.e., preference over a small number of choices) input can enable robots to complete complex tasks. However, few previous works have looked at combining different methods, and in particular, opportunities for a robot to estimate and elicit the most effective form of assistance given its understanding of a task. In this letter, we propose a method for estimating the value of different human assistance mechanisms based on the action uncertainty of a robot policy. Our key idea is to construct mathematical expressions for the expected post-interaction differential entropy (i.e., uncertainty) of a stochastic robot policy to compare the expected value of different interactions. As each type of human input imposes a different requirement for human involvement, we demonstrate how differential entropy estimates can be combined with a likelihood penalization approach to effectively balance feedback informational needs with the level of required input. We demonstrate evidence of how our approach interfaces with emergent learning models (e.g., a diffusion model) to produce accurate assistance value estimates through both simulation and a robot user study. Our user study results indicate that the proposed approach can enable task completion with minimal human feedback for uncertain robot behaviors.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 6","pages":"5473-5480"},"PeriodicalIF":4.6,"publicationDate":"2025-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143888232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FS$^{2}$D: Fully Sparse Few-Shot 3D Object Detection
IF 4.6, CAS Zone 2, Computer Science
IEEE Robotics and Automation Letters Pub Date : 2025-04-14 DOI: 10.1109/LRA.2025.3560868
Chunzheng Li;Gaihua Wang;Zeng Liang;Qian Long;Zhengshu Zhou;Xuran Pan
{"title":"FS$^{2}$D: Fully Sparse Few-Shot 3D Object Detection","authors":"Chunzheng Li;Gaihua Wang;Zeng Liang;Qian Long;Zhengshu Zhou;Xuran Pan","doi":"10.1109/LRA.2025.3560868","DOIUrl":"https://doi.org/10.1109/LRA.2025.3560868","url":null,"abstract":"Corner cases are a focal issue in current autonomous driving systems, with a significant portion attributed to few-shot detection. Due to the sparse distribution of point cloud data and the real-time requirements of autonomous driving, traditional few-shot detection methods face challenges in direct application to the 3D domain, making it more difficult for outdoor scene 3D detectors to handle corner cases. In this study, we employ fully sparse feature matching and aggregation operations, utilizing meta-learning methods to enhance performance on few-shot categories without increasing network inference parameters. Furthermore, our few-shot research is based on the inherent characteristics of publicly available data without introducing additional categories, allowing for fair comparisons with existing methods. Extensive experiments were conducted on the widely used nuScenes dataset to validate the effectiveness of our method. We demonstrate superior performance compared to the baseline method, especially in handling few-shot categories.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 6","pages":"5847-5854"},"PeriodicalIF":4.6,"publicationDate":"2025-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143902689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
LiVeDet: Lightweight Density-Guided Adaptive Transformer for Online On-Device Vessel Detection
IF 4.6, CAS Zone 2, Computer Science
IEEE Robotics and Automation Letters Pub Date : 2025-04-10 DOI: 10.1109/LRA.2025.3559834
Zijie Zhang;Changhong Fu;Yongkang Cao;Mengyuan Li;Haobo Zuo
{"title":"LiVeDet: Lightweight Density-Guided Adaptive Transformer for Online On-Device Vessel Detection","authors":"Zijie Zhang;Changhong Fu;Yongkang Cao;Mengyuan Li;Haobo Zuo","doi":"10.1109/LRA.2025.3559834","DOIUrl":"https://doi.org/10.1109/LRA.2025.3559834","url":null,"abstract":"Vision-based online vessel detection boosts the automation of waterways monitoring, transportation management and navigation safety. However, a significant gap exists in on-device deployment between general high-performance PCs/servers and embedded AI processors. Existing state-of-the-art (SOTA) online vessel detectors lack sufficient accuracy and are prone to high latency on the edge AI camera, especially in scenarios with dense vessels and diverse distributions. To solve the above issues, a novel lightweight framework with density-guided adaptive Transformer (LiVeDet) is proposed for the edge AI camera to achieve online on-device vessel detection. Specifically, a new instance-aware representation extractor is designed to suppress cluttered background noise and capture instance-aware content information. Additionally, an innovative vessel distribution estimator is developed to direct superior feature representation learning by focusing on local regions with varying vessel density. Besides, a novel dynamic region embedding is presented to integrate hierarchical features represented by multi-scale vessels. A new benchmark comprising 100 high-definition, high-framerate video sequences from vessel-intensive scenarios is established to evaluate the efficacy of vessel detectors under challenging conditions prevalent in dynamic waterways. Extensive evaluations on this challenging benchmark demonstrate the robustness and efficiency of LiVeDet, achieving 32.9 FPS on the edge AI camera. Furthermore, real-world applications confirm the practicality of the proposed method.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 6","pages":"5513-5520"},"PeriodicalIF":4.6,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143888385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
RGB-Based Category-Level Object Pose Estimation via Depth Recovery and Adaptive Refinement
IF 4.6, CAS Zone 2, Computer Science
IEEE Robotics and Automation Letters Pub Date : 2025-04-10 DOI: 10.1109/LRA.2025.3559841
Hui Yang;Wei Sun;Jian Liu;Jin Zheng;Zhiwen Zeng;Ajmal Mian
{"title":"RGB-Based Category-Level Object Pose Estimation via Depth Recovery and Adaptive Refinement","authors":"Hui Yang;Wei Sun;Jian Liu;Jin Zheng;Zhiwen Zeng;Ajmal Mian","doi":"10.1109/LRA.2025.3559841","DOIUrl":"https://doi.org/10.1109/LRA.2025.3559841","url":null,"abstract":"Category-level pose estimation methods have received widespread attention as they can be generalized to intra-class unseen objects. Although RGB-D-based category-level methods have made significant progress, reliance on depth image limits practical application. RGB-based methods offer a more practical and cost-effective solution. However, current RGB-based methods struggle with object geometry perception, leading to inaccurate pose estimation. We propose depth recovery and adaptive refinement for category-level object pose estimation from a single RGB image. We leverage DINOv2 to reconstruct the coarse scene-level depth from the input RGB image and propose an adaptive refinement network based on an encoder-decoder architecture to dynamically improve the predicted coarse depth and reduce its gap from the ground truth. We introduce a 2D–3D consistency loss to ensure correspondence between the point cloud obtained from depth projection and the objects in the 2D image. This consistency supervision enables the model to maintain alignment between the depth image and the point cloud. Finally, we extract features from the refined point cloud and feed them into two confidence-aware rotation regression branches and a translation and size prediction residual branch for end-to-end training. Decoupling the rotation matrix provides a more direct representation, which facilitates parameter optimization and gradient propagation. Extensive experiments on the REAL275 and CAMERA25 datasets demonstrate the superior performance of our method. Real-world estimation and robotic grasping experiments demonstrate our model robustness to occlusion, clutter environments, and low-textured objects. Our code and robotic grasping video are available at DA-Pose.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 6","pages":"5377-5384"},"PeriodicalIF":4.6,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143856250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Learning Cross-Modal Visuomotor Policies for Autonomous Drone Navigation
IF 4.6, CAS Zone 2, Computer Science
IEEE Robotics and Automation Letters Pub Date : 2025-04-10 DOI: 10.1109/LRA.2025.3559824
Yuhang Zhang;Jiaping Xiao;Mir Feroskhan
{"title":"Learning Cross-Modal Visuomotor Policies for Autonomous Drone Navigation","authors":"Yuhang Zhang;Jiaping Xiao;Mir Feroskhan","doi":"10.1109/LRA.2025.3559824","DOIUrl":"https://doi.org/10.1109/LRA.2025.3559824","url":null,"abstract":"Developing effective vision-based navigation algorithms adapting to various scenarios is a significant challenge for autonomous drone systems, with vast potential in diverse real-world applications. This paper proposes a novel visuomotor policy learning framework for monocular autonomous navigation, combining cross-modal contrastive learning with deep reinforcement learning (DRL) to train a visuomotor policy. Our approach first leverages contrastive learning to extract consistent, task-focused visual representations from high-dimensional RGB images as depth images, and then directly maps these representations to action commands with DRL. This framework enables RGB images to capture structural and spatial information similar to depth images, which remains largely invariant under changes in lighting and texture, thereby maintaining robustness across various environments. We evaluate our approach through simulated and physical experiments, showing that our visuomotor policy outperforms baseline methods in both effectiveness and resilience to unseen visual disturbances. Our findings suggest that the key to enhancing transferability in monocular RGB-based navigation lies in achieving consistent, well-aligned visual representations across scenarios, which is an aspect often lacking in traditional end-to-end approaches.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 6","pages":"5425-5432"},"PeriodicalIF":4.6,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143860761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Topo-Field: Topometric Mapping With Brain-Inspired Hierarchical Layout-Object-Position Fields
IF 4.6, CAS Zone 2, Computer Science
IEEE Robotics and Automation Letters Pub Date : 2025-04-10 DOI: 10.1109/LRA.2025.3559836
Jiawei Hou;Wenhao Guan;Longfei Liang;Jianfeng Feng;Xiangyang Xue;Taiping Zeng
{"title":"Topo-Field: Topometric Mapping With Brain-Inspired Hierarchical Layout-Object-Position Fields","authors":"Jiawei Hou;Wenhao Guan;Longfei Liang;Jianfeng Feng;Xiangyang Xue;Taiping Zeng","doi":"10.1109/LRA.2025.3559836","DOIUrl":"https://doi.org/10.1109/LRA.2025.3559836","url":null,"abstract":"Mobile robots require comprehensive scene understanding to operate effectively in diverse environments, enriched with contextual information such as layouts, objects, and their relationships. Although advances like neural radiance fields (NeRFs) offer high-fidelity 3D reconstructions, they are computationally intensive and often lack efficient representations of traversable spaces essential for planning and navigation. In contrast, topological maps are computationally efficient but lack the semantic richness necessary for a more complete understanding of the environment. Inspired by a population code in the postrhinal cortex (POR) strongly tuned to spatial layouts over scene content rapidly forming a high-level cognitive map, this work introduces Topo-Field, a framework that integrates Layout-Object-Position (LOP) associations into a neural field and constructs a topometric map from this learned representation. LOP associations are modeled by explicitly encoding object and layout information, while a Large Foundation Model (LFM) technique allows for efficient training without extensive annotations. The topometric map is then constructed by querying the learned neural representation, offering both semantic richness and computational efficiency. Empirical evaluations in multi-room environments demonstrate the effectiveness of Topo-Field in tasks such as position attribute inference, query localization, and topometric planning, successfully bridging the gap between high-fidelity scene understanding and efficient robotic navigation.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 6","pages":"5385-5392"},"PeriodicalIF":4.6,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143856378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Null Space Compliance Approach for Maintaining Safety and Tracking Performance in Human-Robot Interactions
IF 4.6, CAS Zone 2, Computer Science
IEEE Robotics and Automation Letters Pub Date : 2025-04-10 DOI: 10.1109/LRA.2025.3559845
Zi-Qi Yang;Miaomiao Wang;Mehrdad R. Kermani
{"title":"A Null Space Compliance Approach for Maintaining Safety and Tracking Performance in Human-Robot Interactions","authors":"Zi-Qi Yang;Miaomiao Wang;Mehrdad R. Kermani","doi":"10.1109/LRA.2025.3559845","DOIUrl":"https://doi.org/10.1109/LRA.2025.3559845","url":null,"abstract":"In recent years, the focus on developing robot manipulators has shifted towards prioritizing safety in Human-Robot Interaction (HRI). Impedance control is a typical approach for interaction control in collaboration tasks. However, such a control approach has two main limitations: 1) the end-effector (EE)’s limited compliance to adapt to unknown physical interactions, and 2) inability of the robot body to compliantly adapt to unknown physical interactions. In this work, we present an approach to address these drawbacks. We introduce a modified Cartesian impedance control method combined with a Dynamical System (DS)-based motion generator, aimed at enhancing the interaction capability of the EE without compromising main task tracking performance. This approach enables human coworkers to interact with the EE on-the-fly, e.g. tool changeover, after which the robot compliantly resumes its task. Additionally, combining with a new null space impedance control method enables the robot body to exhibit compliant behaviour in response to interactions, avoiding serious injuries from accidental contact while mitigating the impact on main task tracking performance. Finally, we prove the passivity of the system and validate the proposed approach through comprehensive comparative experiments on a 7 Degree-of-Freedom (DOF) KUKA LWR IV+ robot.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 6","pages":"5369-5376"},"PeriodicalIF":4.6,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143856201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
VoxNeRF: Bridging Voxel Representation and Neural Radiance Fields for Enhanced Indoor View Synthesis
IF 4.6, CAS Zone 2, Computer Science
IEEE Robotics and Automation Letters Pub Date : 2025-04-10 DOI: 10.1109/LRA.2025.3559844
Sen Wang;Qing Cheng;Stefano Gasperini;Wei Zhang;Shun-Cheng Wu;Niclas Zeller;Daniel Cremers;Nassir Navab
{"title":"VoxNeRF: Bridging Voxel Representation and Neural Radiance Fields for Enhanced Indoor View Synthesis","authors":"Sen Wang;Qing Cheng;Stefano Gasperini;Wei Zhang;Shun-Cheng Wu;Niclas Zeller;Daniel Cremers;Nassir Navab","doi":"10.1109/LRA.2025.3559844","DOIUrl":"https://doi.org/10.1109/LRA.2025.3559844","url":null,"abstract":"The generation of high-fidelity view synthesis is essential for robotic navigation and interaction but remains challenging, particularly in indoor environments and real-time scenarios. Existing techniques often require significant computational resources for both training and rendering, and they frequently result in suboptimal 3D representations due to insufficient geometric structuring. To address these limitations, we introduce VoxNeRF, a novel approach that utilizes easy-to-obtain geometry priors to enhance both the quality and efficiency of neural indoor reconstruction and novel view synthesis. We propose an efficient voxel-guided sampling technique that allocates computational resources selectively to the most relevant segments of rays based on a voxel-encoded geometry prior, significantly reducing training and rendering time. Additionally, we incorporate a robust depth loss to improve reconstruction and rendering quality in sparse view settings. Our approach is validated with extensive experiments on ScanNet and ScanNet++ where VoxNeRF outperforms existing state-of-the-art methods and establishes a new benchmark for indoor immersive interpolation and extrapolation settings.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 6","pages":"5903-5910"},"PeriodicalIF":4.6,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10960747","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143902727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0