A multiscale feature fusion-guided lightweight semantic segmentation network

IF 4.2 · CAS Tier 2 (Computer Science) · JCR Q2 (ROBOTICS)
Xin Ye, Junchen Pan, Jichen Chen, Jingbo Zhang
DOI: 10.1002/rob.22406 (https://onlinelibrary.wiley.com/doi/10.1002/rob.22406)
Journal of Field Robotics, Vol. 42, No. 1, pp. 272–286. Published: 2024-08-08.
Citations: 0

Abstract


Semantic segmentation, the task of assigning a class label to each pixel in an image, has found applications in various real-world scenarios, including autonomous driving and scene understanding. However, its widespread use is hindered by a high computational burden. In this paper, we propose an efficient semantic segmentation method based on a Feature Cascade Fusion Network (FCFNet) to address this challenge. FCFNet uses a dual-path framework comprising the Spatial Information Path (SIP) and the Context Information Path (CIP). The SIP is a shallow structure that captures the local dependencies of each pixel to improve the accuracy of detailed segmentation. The CIP is the main branch, with a deeper structure that captures sufficient contextual information from the input features. Moreover, we design an Efficient Receptive Field Module (ERFM) to enlarge the receptive field in the SIP, while an Attention Shuffled Refinement Module refines feature maps from different stages. Finally, we present an Attention-Guided Fusion Module that fuses the low- and high-level feature maps effectively. Experimental results show that FCFNet achieves 70.7% mean intersection over union (mIoU) on the Cityscapes data set and 68.1% mIoU on the CamVid data set, with inference speeds of 110 and 100 frames per second (FPS), respectively. Additionally, we evaluated FCFNet on the Nvidia Jetson Xavier embedded device, where it demonstrated competitive performance while significantly reducing power consumption.
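The abstract reports results in mean intersection over union (mIoU), the standard segmentation metric: per-class IoU is the overlap between predicted and ground-truth pixels for that class divided by their union, averaged over classes. As a reminder of how it is computed, here is a minimal NumPy sketch (not the authors' code; the `mean_iou` function and the toy label maps are illustrative):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes.

    pred, target: integer label maps of the same shape.
    Classes absent from both maps are skipped so they do not
    distort the average.
    """
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x2 example with two classes:
pred   = np.array([[0, 1], [1, 1]])
target = np.array([[0, 1], [0, 1]])
# class 0: inter=1, union=2 -> 0.5; class 1: inter=2, union=3 -> 2/3
print(mean_iou(pred, target, num_classes=2))  # ~0.5833
```

Benchmark implementations (e.g., the Cityscapes evaluation scripts) accumulate the intersection and union counts over the whole test set before dividing, rather than averaging per-image scores, but the per-class ratio is the same idea.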

Source journal
Journal of Field Robotics (Engineering & Technology – Robotics)
CiteScore: 15.00 · Self-citation rate: 3.60% · Articles per year: 80 · Review time: 6 months
Journal description: The Journal of Field Robotics seeks to promote scholarly publications dealing with the fundamentals of robotics in unstructured and dynamic environments. The Journal focuses on experimental robotics and encourages publication of work that has both theoretical and practical significance.