{"title":"A Fast Point Cloud Ground Segmentation Approach Based on Block-Sparsely Connected Coarse-to-Fine Markov Random Field","authors":"Weixin Huang;Linglong Lin;Shaobo Wang;Zhun Fan;Biao Yu;Jiajia Chen","doi":"10.1109/LRA.2025.3546071","DOIUrl":"https://doi.org/10.1109/LRA.2025.3546071","url":null,"abstract":"Ground segmentation is an essential preprocessing task for autonomous vehicles with 3D LiDARs. Nevertheless, current methods for ground segmentation fall short of achieving optimal performance, primarily hindered by under-segmentation, over-segmentation, slow-segmentation, and poor adaptability. This letter proposes a fast block-sparsely connected coarse-to-fine Markov Random Field (MRF) point cloud ground segmentation approach to address the above challenges. It starts with a ring-shaped elevation continuity map for non-ground segmentation, followed by a range image-based algorithm to separate high-confidence and uncertain points to complete the coarse segmentation. Finally, it uses a block-sparsely connected MRF construct method to organize the point cloud and employs the graph cut method to solve the MRFs for fine segmentation in parallel. Comparison with other state-of-the-art methods on the SemanticKITTI dataset demonstrates that the proposed algorithm achieves the highest accuracy among non-deep learning methods. Experiments on the 32-beam and 128-beam datasets demonstrate its advantages in terms of generalization capability. 
Additionally, our method processes Velodyne HDL-64E data frames in real-time (10.33 ms) on an Intel i9-11900 K CPU, which is significantly faster than other MRF-based methods.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 4","pages":"3843-3850"},"PeriodicalIF":4.6,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143621678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DR-MPC: Deep Residual Model Predictive Control for Real-World Social Navigation","authors":"James R. Han;Hugues Thomas;Jian Zhang;Nicholas Rhinehart;Timothy D. Barfoot","doi":"10.1109/LRA.2025.3546106","DOIUrl":"https://doi.org/10.1109/LRA.2025.3546106","url":null,"abstract":"How can a robot safely navigate around people with complex motion patterns? Deep Reinforcement Learning (DRL) in simulation holds some promise, but much prior work relies on simulators that fail to capture the nuances of real human motion. Thus, we propose Deep Residual Model Predictive Control (DR-MPC) to enable robots to quickly and safely perform DRL from real-world crowd navigation data. By blending MPC with model-free DRL, DR-MPC overcomes the DRL challenges of large data requirements and unsafe initial behavior. DR-MPC is initialized with MPC-based path tracking, and gradually learns to interact more effectively with humans. To further accelerate learning, a safety component estimates out-of-distribution states to guide the robot away from likely collisions. In simulation, we show that DR-MPC substantially outperforms prior work, including traditional DRL and residual DRL models. Hardware experiments show our approach successfully enables a robot to navigate a variety of crowded situations with few errors using less than 4 hours of training data.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 4","pages":"4029-4036"},"PeriodicalIF":4.6,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143645288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DDN-SLAM: Real Time Dense Dynamic Neural Implicit SLAM","authors":"Mingrui Li;Zhetao Guo;Tianchen Deng;Yiming Zhou;Yuxiang Ren;Hongyu Wang","doi":"10.1109/LRA.2025.3546130","DOIUrl":"https://doi.org/10.1109/LRA.2025.3546130","url":null,"abstract":"SLAM systems based on NeRF have demonstrated superior performance in rendering quality and scene reconstruction for static environments compared to traditional dense SLAM. However, they encounter tracking drift and mapping errors in real-world scenarios with dynamic interferences. To address these issues, we propose DDN-SLAM, a real-time dense dynamic neural implicit SLAM system integrating semantic features. To address dynamic tracking interferences, we propose a feature point segmentation method that combines semantic features with a mixed Gaussian distribution model. To avoid incorrect background removal, we propose a mapping strategy based on sparse point cloud sampling and background restoration. We propose a dynamic semantic loss to eliminate dynamic occlusions. Experimental results demonstrate that DDN-SLAM is capable of robustly tracking and producing high-quality reconstructions in dynamic environments, while appropriately preserving potential dynamic objects. 
Compared to existing neural implicit SLAM systems, the tracking results on dynamic datasets indicate an average 90% improvement in Average Trajectory Error (ATE) accuracy.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 5","pages":"4300-4307"},"PeriodicalIF":4.6,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143688172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Robotics and Automation Society Information","authors":"","doi":"10.1109/LRA.2025.3543709","DOIUrl":"https://doi.org/10.1109/LRA.2025.3543709","url":null,"abstract":"","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 3","pages":"C3-C3"},"PeriodicalIF":4.6,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10903211","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143512927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Robotics and Automation Letters Information for Authors","authors":"","doi":"10.1109/LRA.2025.3543711","DOIUrl":"https://doi.org/10.1109/LRA.2025.3543711","url":null,"abstract":"","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 3","pages":"C4-C4"},"PeriodicalIF":4.6,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10903547","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143512926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Robotics and Automation Society Information","authors":"","doi":"10.1109/LRA.2025.3543707","DOIUrl":"https://doi.org/10.1109/LRA.2025.3543707","url":null,"abstract":"","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 3","pages":"C2-C2"},"PeriodicalIF":4.6,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10903152","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised Discovery of Objects Physical Properties Through Maximum Entropy Reinforcement Learning","authors":"Maxime Chareyre;Pierre Fournier;Julien Moras;Jean-Marc Bourinet;Youcef Mezouar","doi":"10.1109/LRA.2025.3545382","DOIUrl":"https://doi.org/10.1109/LRA.2025.3545382","url":null,"abstract":"Understanding the environment is crucial for autonomous robots to perform navigation and manipulation tasks. Never-seen-before objects may have complex appearances and dynamics, where only physical interactions can help to identify visually hidden properties like mass or friction. In this work we propose a baseline for the newly defined problem of using physical interactions to discover unknown properties of objects, without prior knowledge of them or any supervision. The agent first uses intrinsically motivated unsupervised reinforcement learning to learn how to interact with objects, so as to get observations with a level of information which eases the physical properties estimation. A self-supervised predictive task is then set up while following the learned behaviour to extract a latent representation of the physical properties of an object. When applied to a simulated mobile robot in presence of varying objects, the proposed baseline identifies and differentiates categorical properties, e.g. shape, and quantifies continuous properties, e.g. mass and friction, with excellent correlations to their true values even from noisy observations. It achieves significantly better results than simple interactions of a policy that performs poor exploration. 
This work provides an implementation of a functional, object-oriented action-perception cycle for embodied robotic agents.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 4","pages":"3723-3730"},"PeriodicalIF":4.6,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143601910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"W-ControlUDA: Weather-Controllable Diffusion-assisted Unsupervised Domain Adaptation for Semantic Segmentation","authors":"Fengyi Shen;Li Zhou;Kagan Kuecuekaytekin;George Basem Fouad Eskandar;Ziyuan Liu;He Wang;Alois Knoll","doi":"10.1109/LRA.2025.3544925","DOIUrl":"https://doi.org/10.1109/LRA.2025.3544925","url":null,"abstract":"Image generation has emerged as a potent strategy to enrich training data for unsupervised domain adaptation (UDA) of semantic segmentation in adverse weathers due to the scarcity of labelled target domain data. Previous UDA works commonly utilize generative adversarial networks (GANs) to translate images from the source to the target domain to enhance UDA training. However, these GANs, trained from scratch in an unpaired manner, produce sub-optimal image quality and lack multi-weather controllability. Consequently, controllable data generation for diverse weather scenarios remains underexplored. The recent strides in text-to-image diffusion models (DM) enables high fidelity diverse image generation conditioned on semantic labels. However, such DMs must be trained in a paired manner, i.e., image and label pairs, which poses huge challenge to the UDA setting where target domain labels are missing. This work addresses two key questions: <italic>What is an optimal approach to train DMs for UDA, and how can the generated data best enhance UDA performance?</i> We introduce W-ControlUDA, a diffusion-assisted framework for UDA segmentation in adverse weather. W-ControlUDA involves two steps: DM training for data augmentation and UDA training using the generated data. Unlike previous unpaired training, our method conditions the DM on target predictions from a pre-trained segmentor, addressing the lack of target labels. We propose UDAControlNet for high-fidelity cross-domain and intra-domain data generation under adverse weathers. In UDA training, a label filtering mechanism is introduced to ensure more reliable results. 
W-ControlUDA helps UDA achieve a new milestone (72.8 mIoU) on the popular Cityscapes-to-ACDC benchmark and notably improves the model's generalization on 5 other benchmarks.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 5","pages":"4204-4211"},"PeriodicalIF":4.6,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10900417","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143667543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DRL-DCLP: A Deep Reinforcement Learning-Based Dimension-Configurable Local Planner for Robot Navigation","authors":"Wei Zhang;Shanze Wang;Mingao Tan;Zhibo Yang;Xianghui Wang;Xiaoyu Shen","doi":"10.1109/LRA.2025.3544927","DOIUrl":"https://doi.org/10.1109/LRA.2025.3544927","url":null,"abstract":"In this letter, we present a deep reinforcement learning-based dimension-configurable local planner (DRL-DCLP) for solving robot navigation problems. DRL-DCLP is the first neural-network local planner capable of handling rectangular differential-drive robots with varying dimension configurations without requiring post-fine-tuning. While DRL has shown excellent performance in enabling robots to navigate complex environments, it faces a significant limitation compared to conventional local planners: dimension-specificity. This constraint implies that a trained controller for a specific configuration cannot be generalized to robots with different physical dimensions, velocity ranges, or acceleration limits. To overcome this limitation, we introduce a dimension-configurable input representation and a novel learning curriculum for training the navigation agent. Extensive experiments demonstrate that DRL-DCLP facilitates successful navigation for robots with diverse dimensional configurations, achieving superior performance across various navigation tasks.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 4","pages":"3636-3643"},"PeriodicalIF":4.6,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143570847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TinyVLA: Toward Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation","authors":"Junjie Wen;Yichen Zhu;Jinming Li;Minjie Zhu;Zhibin Tang;Kun Wu;Zhiyuan Xu;Ning Liu;Ran Cheng;Chaomin Shen;Yaxin Peng;Feifei Feng;Jian Tang","doi":"10.1109/LRA.2025.3544909","DOIUrl":"https://doi.org/10.1109/LRA.2025.3544909","url":null,"abstract":"Vision-Language-Action (VLA) models have shown remarkable potential in visuomotor control and instruction comprehension through end-to-end learning processes. However, current VLA models face significant challenges: they are slow during inference and require extensive pre-training on large amounts of robotic data, making real-world deployment difficult. In this letter, we introduce a new family of compact vision-language-action models, called TinyVLA, which offers two key advantages over existing VLA models: (1) faster inference speeds, and (2) improved data efficiency, eliminating the need for pre-training stage. Our framework incorporates two essential components to build TinyVLA: (1) initializing the policy backbone with robust, high-speed multimodal models, and (2) integrating a diffusion policy decoder during fine-tuning to enable precise robot actions. We conducted extensive evaluations of TinyVLA in both simulation and on real robots, demonstrating that our approach significantly outperforms the state-of-the-art VLA model, OpenVLA, in terms of speed and data efficiency, while delivering comparable or superior performance. Additionally, TinyVLA exhibits strong generalization capabilities across various dimensions, including language instructions, novel objects, unseen positions, changes in object appearance, background variations, and environmental shifts, often matching or exceeding the performance of OpenVLA. 
We believe that TinyVLA offers an interesting perspective on utilizing pre-trained multimodal models for policy learning.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 4","pages":"3988-3995"},"PeriodicalIF":4.6,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143645297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}