MSSINet: Real-Time Segmentation Based on Multi-Scale Strip Integration
Lin Wang; Fenghua Zhu; Hui Zhang; Gang Xiong; Yunhu Huang; Dewang Chen
IEEE Journal of Radio Frequency Identification, published 2024-04-16
DOI: 10.1109/JRFID.2024.3389088
https://ieeexplore.ieee.org/document/10500690/
Citations: 0
Abstract
Semantic segmentation plays a fundamental role in computer vision, underpinning applications such as autonomous driving and scene analysis. Although dual-branch networks have improved both accuracy and processing speed, they remain weak at context extraction in the low-resolution branch. That stage traditionally relies on square pooling, which overlooks stripe-shaped contextual information. In response, we introduce a novel architecture based on a deep aggregation pyramid, engineered for both real-time processing and precise segmentation. Central to our approach is a contextual information extractor designed to expand the effective receptive field and fuse multi-scale context from low-resolution feature maps. Additionally, we have developed a feature fusion module to enhance the integration and differentiation of high-level semantic information across branches. To further refine segmentation fidelity, we apply two deep supervision signals at an intermediate layer of the high-resolution branch, one focused on boundary delineation and one on global features, to enrich spatial detail capture. Our experiments on the Cityscapes and CamVid datasets confirm MSSINet's strong performance and show that it is competitive with leading existing methods across a variety of scenarios.
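
To make the contrast between square pooling and strip-shaped context concrete, the following is a minimal sketch in plain Python. It is not the extractor described in the paper, whose details the abstract does not give; it only illustrates generic strip pooling, that is, averaging along full rows and full columns. The function names (`square_avg_pool`, `strip_pool`, `strip_context`) and the toy feature map are assumptions made for illustration.

```python
def square_avg_pool(fmap, k):
    """Non-overlapping k x k average pooling (H and W must be divisible by k)."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            sum(fmap[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
            for j in range(0, w, k)
        ]
        for i in range(0, h, k)
    ]


def strip_pool(fmap):
    """Average along each full row and each full column (hypothetical helper)."""
    h, w = len(fmap), len(fmap[0])
    row_means = [sum(row) / w for row in fmap]  # one value per row
    col_means = [sum(fmap[i][j] for i in range(h)) / h for j in range(w)]  # one per column
    return row_means, col_means


def strip_context(fmap):
    """Broadcast both strip averages back to H x W and add them."""
    row_means, col_means = strip_pool(fmap)
    return [[r + c for c in col_means] for r in row_means]


# Toy 4 x 4 map with a one-pixel-tall horizontal stripe (for example, a lane marking).
fmap = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

square = square_avg_pool(fmap, 2)      # stripe is blended with the empty row above it
row_means, col_means = strip_pool(fmap)
context = strip_context(fmap)          # stripe row keeps a distinctly higher response
```

In this toy case the 2 x 2 square window halves the stripe's response by mixing it with the neighbouring empty row, while the row-wise strip average keeps the stripe at full strength. A real network would apply learned convolutions to the pooled strips, and at several scales, before fusing them.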