Zhenzhong Huang, Hongjuan Shao, Chao Ren, Hongman Li, Haoming Bai, Zhou Lei, Gu Yao, Qinyi Chen
DOI: 10.1016/j.ejrs.2025.07.006
Egyptian Journal of Remote Sensing and Space Sciences, Volume 28, Issue 3, Pages 523–533. Published 2025-08-05. Impact Factor 4.1 (Q2, Environmental Sciences).
Available at: https://www.sciencedirect.com/science/article/pii/S1110982325000456
TriM-Net: Trinityformer-Mamba fusion for road extraction in remote sensing
Precise road information extraction is crucial for transportation and intelligent sensing. Recently, the fusion of CNN and Transformer architectures in remote sensing-based road extraction, along with U-shaped semantic segmentation networks, has gained significant attention. However, existing methods rely heavily on global features while overlooking local details, which limits accuracy in complex road scenarios. To address this, we propose the Trinityformer-Mamba Network (TriM-Net) to enhance local feature extraction. TriM-Net adopts Trinityformer, a modified Transformer architecture that optimizes local feature perception and reduces computational overhead by replacing the traditional softmax with an improved self-attention mechanism and a novel normalization method. The feedforward network employs a Kolmogorov-Arnold network (KAN), which reduces the neuron count while enhancing local detail capture through edge activation functions and the Arnold transform. Additionally, the normalization layer combines the benefits of BatchNorm and LayerNorm for better performance. Furthermore, TriM-Net incorporates an MT_block built from stacked Mamba networks; by leveraging their internal CausalConv1D and SSM modules, this block strengthens sequence modeling and local perception while effectively merging Transformer and CNN information for improved image reconstruction. Experimental results demonstrate TriM-Net's significant superiority over existing state-of-the-art models. On the LSRV dataset, it outperformed the second-best model by 2.17% in Precision, 0.34% in Recall, 1.72% in IoU, and 2.09% in F1-score. Similarly, on the Massachusetts Road Dataset, it surpassed its closest competitor by 0.45% in Recall, 1.41% in IoU, and 1.07% in F1-score. These improvements highlight TriM-Net's strong performance in road information extraction.
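The abstract states that Trinityformer replaces the traditional softmax in self-attention, but it does not give the exact formula. As a rough illustration of how softmax can be removed from attention, here is a standard linear-attention sketch using an elu+1 feature map; the function name and the choice of feature map are illustrative assumptions, not the paper's actual mechanism.

```python
import numpy as np

def softmax_free_attention(q, k, v, eps=1e-6):
    """Linear attention: out_i = phi(q_i) @ (phi(K)^T V) / (phi(q_i) @ sum_j phi(k_j)).

    Avoids softmax entirely; cost is linear in sequence length because
    phi(K)^T V is computed once and reused for every query.
    Shapes: q (n, d), k (m, d), v (m, d_v) -> output (n, d_v).
    """
    def phi(x):
        # elu(x) + 1: a common positive feature map for linear attention
        return np.where(x > 0, x + 1.0, np.exp(x))

    qp, kp = phi(q), phi(k)
    kv = kp.T @ v                    # (d, d_v): aggregate keys and values once
    z = qp @ kp.sum(axis=0)          # (n,): per-query normalizer
    return (qp @ kv) / (z[:, None] + eps)
```

Because the attention weights implicitly sum to one, feeding identical value rows returns (approximately) that row for every query, which is a quick sanity check on the normalization.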
Journal overview:
The Egyptian Journal of Remote Sensing and Space Sciences (EJRS) encompasses a comprehensive range of topics within Remote Sensing, Geographic Information Systems (GIS), planetary geology, and space technology development, including theories, applications, and modeling. EJRS aims to disseminate high-quality, peer-reviewed research focusing on the advancement of remote sensing and GIS technologies and their practical applications for effective planning, sustainable development, and environmental resource conservation. The journal particularly welcomes innovative papers with broad scientific appeal.