Title: MMSeg: A multimodal multi-scale point cloud segmentation model for navigable areas in complex field environments
Authors: Yifang Huang, Hongdou He, Peng Shi, Xiaobing Hao, Haitao He, Pei Miao
DOI: 10.1016/j.robot.2025.105229
Journal: Robotics and Autonomous Systems, Vol. 195, Article 105229 (JCR Q1, Automation & Control Systems; Impact Factor 5.2)
Publication date: 2025-10-13 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S0921889025003264
Citations: 0
Abstract
This study develops advanced navigable-area perception for autonomous ground intelligent agents operating in complex field environments characterized by unstructured, diverse, and intricate terrain. Existing methods predominantly target structured environments and neglect the unique challenges of unstructured terrain, which is central to applications such as battlefield support and emergency rescue missions. To address this gap, we propose a Multimodal Multi-scale point cloud Segmentation (MMSeg) model with three key contributions. First, we introduce a multimodal ground feature fusion technique that integrates geometric information from LiDAR point clouds with visual texture features from images, improving the recognition of heterogeneous ground surfaces. Second, we propose a local–global terrain geometry information enhancement method that uses a dual-attention mechanism to capture and analyze both local and global geometric features under complex terrain conditions. Third, we design a multi-scale classifier framework that processes the fused multimodal information on ground materials and terrain structure, enabling precise segmentation of navigable areas. Experiments on a dedicated platform show that MMSeg achieves a mean IoU (mIoU) 6% higher than commonly used point cloud segmentation models. These findings suggest that MMSeg significantly enhances the perception capabilities of autonomous ground intelligent agents in challenging environments, providing a promising and novel solution for improving their operational effectiveness in complex field conditions.
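The first contribution, fusing LiDAR geometry with image texture, typically relies on projecting each 3D point into the camera frame and sampling per-pixel features there. The sketch below illustrates that generic projection-and-concatenation step only; all function names, feature dimensions, and the pinhole intrinsics are illustrative assumptions, not the paper's actual MMSeg implementation.

```python
import numpy as np

# Hypothetical sketch of LiDAR-image feature fusion. Shapes and names are
# illustrative assumptions; MMSeg's real fusion module is not reproduced here.

def project_points(points_xyz, K):
    """Project 3D points (N,3) into pixel coordinates with pinhole intrinsics K (3,3)."""
    uvw = points_xyz @ K.T            # (N,3) homogeneous image coordinates
    return uvw[:, :2] / uvw[:, 2:3]   # perspective divide -> (N,2) pixels

def fuse_features(points_xyz, geom_feats, image_feats, K):
    """Concatenate each point's geometric features with the image texture
    features sampled at its projected pixel (nearest-neighbour sampling)."""
    H, W, _ = image_feats.shape
    uv = np.round(project_points(points_xyz, K)).astype(int)
    u = np.clip(uv[:, 0], 0, W - 1)   # clamp to image bounds
    v = np.clip(uv[:, 1], 0, H - 1)
    tex = image_feats[v, u]           # (N,C) sampled texture features
    return np.concatenate([geom_feats, tex], axis=1)

# Toy example: 4 points in front of the camera, 8-dim geometric features,
# a 16x16 image with 3 feature channels.
K = np.array([[100.0, 0.0, 8.0], [0.0, 100.0, 8.0], [0.0, 0.0, 1.0]])
pts = np.array([[0.0, 0.0, 5.0], [0.1, 0.0, 5.0],
                [0.0, 0.1, 5.0], [-0.1, -0.1, 5.0]])
geom = np.random.default_rng(0).normal(size=(4, 8))
img = np.random.default_rng(1).normal(size=(16, 16, 3))
fused = fuse_features(pts, geom, img, K)
print(fused.shape)  # (4, 11): 8 geometric + 3 texture channels per point
```

The fused per-point vectors would then feed downstream modules such as the attention and classifier stages the abstract describes; those stages are not sketched here.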
About the journal:
Robotics and Autonomous Systems carries articles describing fundamental developments in the field of robotics, with special emphasis on autonomous systems. An important goal of the journal is to extend the state of the art in both symbolic and sensory-based robot control and learning in the context of autonomous systems. The journal also carries articles on the theoretical, computational, and experimental aspects of autonomous systems, or modules of such systems.