Lianyin Jia, Aoxiang Gao, Mengjuan Li, Xiaodong Fu, Haihe Zhou, Jiaman Ding
Pattern Recognition, vol. 172, Article 112457. Published 2025-09-22. DOI: 10.1016/j.patcog.2025.112457
Impact factor: 7.6. JCR: Q1 (Computer Science, Artificial Intelligence).
https://www.sciencedirect.com/science/article/pii/S0031320325011203
HMSNet: Hilbert curve enhanced Mamba for real-time semantic segmentation
Semantic segmentation is a core technology for vehicle perception of the surrounding environment in autonomous driving. However, existing real-time semantic segmentation models face two major challenges: loss of local detail information and inconsistency of intra-class semantic information. To address these issues, we propose a novel network architecture, HMSNet. The network consists of three core modules: the Hilbert curve enhanced Visual Mamba Block (HVM Block), the Selective Attention Fusion Module (SAFM), and the Multi-scale Context-Aware Module (MCAM). The HVM Block uses the Hilbert curve to reduce two-dimensional images to one-dimensional sequences and applies Mamba's selective scanning algorithm to them, enabling the network to capture local dependencies while maintaining a global receptive field, thereby improving the consistency of intra-class semantic information. The SAFM merges local detail information from shallow layers with global semantic information from deep layers, alleviating the loss of local detail. Finally, the MCAM, placed at the end of the network, enhances the model's ability to judge contextual information, thereby improving segmentation accuracy. Experimental results show that HMSNet achieves an excellent balance between segmentation accuracy and inference speed on challenging public datasets, including CamVid, Cityscapes, and ADE20K.
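To illustrate why a Hilbert curve scan preserves locality, the following minimal sketch flattens a square grid along a Hilbert curve using the standard distance-to-coordinate conversion. It is not the authors' HVM Block implementation: the function names (`hilbert_d2xy`, `hilbert_scan_order`, `flatten_along_hilbert`) are hypothetical, and the grid side is assumed to be a power of two. The point it shows is that consecutive elements of the resulting sequence are always neighboring cells in the grid, whereas a row-by-row raster scan jumps across the grid at the end of each row.

```python
def hilbert_d2xy(n, d):
    """Map distance d along a Hilbert curve to (x, y) on an n x n grid.

    Assumes n is a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate/flip the quadrant so sub-curves connect end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_scan_order(n):
    """Return the (x, y) cells of an n x n grid in Hilbert-curve order."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]


def flatten_along_hilbert(grid):
    """Flatten a square 2-D list (rows of equal length) into a 1-D list
    by visiting cells in Hilbert-curve order. grid[y][x] is row y, column x."""
    n = len(grid)
    return [grid[y][x] for x, y in hilbert_scan_order(n)]


def max_step(cells):
    """Largest Manhattan distance between consecutive cells in a scan."""
    return max(abs(x1 - x2) + abs(y1 - y2)
               for (x1, y1), (x2, y2) in zip(cells, cells[1:]))


if __name__ == "__main__":
    n = 8
    raster = [(x, y) for y in range(n) for x in range(n)]
    print("max step, raster scan: ", max_step(raster))
    print("max step, Hilbert scan:", max_step(hilbert_scan_order(n)))
```

In a sequence model that processes tokens in order, such as a selective scan, this ordering keeps spatially adjacent pixels close together in the sequence, which is the property the abstract attributes to the HVM Block.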
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.