LSW-Net: A complex wilderness scenarios segmentation technique based on learnable spherical window multi-modal feature fusion

IF 2.9 · Zone 3 (Engineering & Technology) · JCR Q2 (Engineering, Electrical & Electronic)

Digital Signal Processing · Pub Date: 2025-07-03 · DOI: 10.1016/j.dsp.2025.105440

Hongdou He, Pei Miao, Yifang Huang, Peng Shi, Xiaobing Hao, Guoyan Huang, Bowen Zhao

Digital Signal Processing, Volume 167, Article 105440. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S1051200425004622

Citations: 0

Abstract


LSW-Net: A complex wilderness scenarios segmentation technique based on learnable spherical window multi-modal feature fusion


Wilderness scenarios are unstructured, complex, and diverse, which makes segmentation in such environments challenging. Current research focuses mainly on perception methods for structured environments such as urban roads and pays relatively little attention to unstructured wilderness scenarios. This paper therefore investigates segmentation techniques for complex wilderness scenarios. First, we propose a multi-channel point cloud mapping method designed specifically for wilderness environments, which extracts both the geometric distribution features of terrain structures and ground texture characteristics directly from 3D point cloud data. Second, we propose a learnable spherical window mechanism for multi-modal feature fusion that enables geometry-aware cross-modal interaction. By constructing point cloud spherical windows, the method selects the image context features most relevant to the point cloud features, which enables the registration and fusion of complementary multi-modal features. Finally, a multi-head fusion classifier segments complex wilderness scenarios from the fused multi-modal data. We built an experimental platform for research on perception for unmanned ground intelligent agents and evaluated the proposed model in simulation and in real-world experiments. The model reaches an mIoU of 82.01% in simulated environments and 79.30% in real environments, an improvement of 7.63% and 5.99%, respectively, over traditional methods. The model is suited to segmentation tasks in complex wilderness scenarios and offers a new way to strengthen perception in such environments.
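For readers unfamiliar with the metric quoted in the abstract, the sketch below shows the usual definition of mean intersection-over-union (mIoU): per-class IoU = TP / (TP + FP + FN), averaged over the classes that occur. The abstract does not say how the authors computed their scores, so this is only an illustration of the standard definition; the function name `mean_iou` and the toy labels are hypothetical and are not taken from the paper.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean IoU over classes that appear in the labels or the predictions.

    Illustrative only: the paper's exact evaluation protocol is not
    described in the abstract.
    """
    tp = [0] * num_classes  # true positives per class
    fp = [0] * num_classes  # false positives per class
    fn = [0] * num_classes  # false negatives per class

    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the label was something else
            fn[t] += 1  # label was class t, but it was missed

    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp[c] / denom)

    if not ious:
        raise ValueError("no class appears in the labels or predictions")
    return sum(ious) / len(ious)


# Toy per-point labels for a 3-class problem (made-up data).
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
score = mean_iou(labels, preds, num_classes=3)
print(f"mIoU = {score:.4f}")
```

On the toy data the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. In a real evaluation the labels would be per-point or per-pixel class indices accumulated over the whole test set.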


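Source journal

Digital Signal Processing (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 5.30

Self-citation rate: 17.20%

Articles published: 435

Review time: 66 days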

About the journal: Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal. The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy