Hongdou He, Pei Miao, Yifang Huang, Peng Shi, Xiaobing Hao, Guoyan Huang, Bowen Zhao
Digital Signal Processing, vol. 167, Article 105440. Published 2025-07-03. DOI: 10.1016/j.dsp.2025.105440
https://www.sciencedirect.com/science/article/pii/S1051200425004622
LSW-Net: A complex wilderness scenarios segmentation technique based on learnable spherical window multi-modal feature fusion
Wilderness scenarios are unstructured, complex, and diverse, which makes segmentation in such environments especially challenging. Current research focuses mainly on perception methods for structured environments (such as urban roads) and pays relatively little attention to unstructured wilderness scenarios. This paper therefore investigates segmentation techniques for complex wilderness scenarios. First, we propose a multi-channel point cloud mapping method designed specifically for wilderness environments, which extracts both the geometric distribution features of terrain structures and ground texture characteristics directly from 3D point cloud data. Second, we propose a learnable spherical window mechanism for multi-modal feature fusion, enabling geometry-aware cross-modal interaction. Spherical windows constructed around the point cloud select the image context features most relevant to the point cloud features, which enables the registration and fusion of complementary multi-modal features. Finally, a multi-head fusion classifier segments complex wilderness scenarios from the fused multi-modal data. An experimental platform for research on perception technology for unmanned ground intelligent agents was built, and the proposed model was evaluated both in simulation and in real-world experiments. The results show that the model achieves high segmentation accuracy, with mIoU reaching 82.01% in simulated environments and 79.30% in real environments, improvements of 7.63% and 5.99%, respectively, over traditional methods. The model is suitable for segmentation tasks in complex wilderness scenarios and offers a new way to enhance perception capabilities in such scenarios.
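The abstract reports accuracy as mIoU (mean intersection over union). As a reading aid, the following is a minimal sketch of how mIoU is commonly computed from predicted and ground-truth label arrays. The function names, the use of NumPy, and the choice to skip classes that appear in neither array are assumptions made for illustration; the paper's own evaluation code is not shown in the source.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes):
    """Count (target, pred) label pairs; rows are ground truth, columns are predictions."""
    pred = np.asarray(pred, dtype=np.int64).ravel()
    target = np.asarray(target, dtype=np.int64).ravel()
    # Encode each (target, pred) pair as a single index, then histogram the indices.
    flat = target * num_classes + pred
    counts = np.bincount(flat, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)


def mean_iou(pred, target, num_classes):
    """Average per-class IoU = TP / (TP + FP + FN).

    Assumption for this sketch: classes absent from both arrays are skipped.
    """
    cm = confusion_matrix(pred, target, num_classes)
    tp = np.diag(cm)
    # Column sums give predicted counts, row sums give ground-truth counts.
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    present = union > 0
    return float((tp[present] / union[present]).mean())
```

For example, with predictions `[0, 0, 1, 1]` and ground truth `[0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so `mean_iou` returns their average, 7/12.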
About the journal:
Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The journal invites top-quality research articles at the frontiers of research in all aspects of signal processing. Its objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal.
The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy