FDLNet: Boosting Real-time Semantic Segmentation by Image-size Convolution via Frequency Domain Learning

Qingqing Yan, Shu Li, Chengju Liu, Meilin Liu, Qi Chen
{"title":"FDLNet: Boosting Real-time Semantic Segmentation by Image-size Convolution via Frequency Domain Learning","authors":"Qingqing Yan, Shu Li, Chengju Liu, Meilin Liu, Qi Chen","doi":"10.1109/ICRA48891.2023.10161421","DOIUrl":null,"url":null,"abstract":"This paper proposes a novel real-time semantic segmentation network via frequency domain learning, called FDLNet, which revisits the segmentation task from two critical perspectives: spatial structure description and multilevel feature fusion. We first devise an image-size convolution (IS-Conv) as a global frequency-domain learning operator to capture long-range dependency in a single shot. To model spatial structure information, we construct the global structure representation path (GSRP) based on IS-Conv, which learns a unified edge-region representation with affordable complexity. For efficient and lightweight multi-level feature fusion, we propose the factorized stereoscopic attention (FSA) module, which alleviates semantic confusion and reduces feature redundancy by introducing level-wise attention before channel and spatial attention. Combining the above modules, we propose a concise semantic segmentation framework named FDLNet. We experimentally demonstrate the effectiveness and superiority of the proposed method. FDLNet achieves state-of-the-art performance on the Cityscapes, which reports 76.32% mIoU at 150+ FPS and 79.0% mIoU at 41+ FPS. The code is available at https://github.com/qyan0131/FDLNet.","PeriodicalId":360533,"journal":{"name":"2023 IEEE International Conference on Robotics and Automation (ICRA)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International Conference on Robotics and Automation (ICRA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICRA48891.2023.10161421","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

This paper proposes a novel real-time semantic segmentation network via frequency domain learning, called FDLNet, which revisits the segmentation task from two critical perspectives: spatial structure description and multilevel feature fusion. We first devise an image-size convolution (IS-Conv) as a global frequency-domain learning operator to capture long-range dependency in a single shot. To model spatial structure information, we construct the global structure representation path (GSRP) based on IS-Conv, which learns a unified edge-region representation with affordable complexity. For efficient and lightweight multi-level feature fusion, we propose the factorized stereoscopic attention (FSA) module, which alleviates semantic confusion and reduces feature redundancy by introducing level-wise attention before channel and spatial attention. Combining the above modules, we propose a concise semantic segmentation framework named FDLNet. We experimentally demonstrate the effectiveness and superiority of the proposed method. FDLNet achieves state-of-the-art performance on the Cityscapes, which reports 76.32% mIoU at 150+ FPS and 79.0% mIoU at 41+ FPS. The code is available at https://github.com/qyan0131/FDLNet.
FDLNet:基于频域学习的图像大小卷积增强实时语义分割
本文提出了一种基于频域学习的实时语义分割网络FDLNet,该网络从空间结构描述和多层次特征融合两个关键角度重新审视了语义分割任务。我们首先设计了一个图像大小的卷积(IS-Conv)作为一个全局频域学习算子,以捕获单个镜头中的远程依赖关系。为了对空间结构信息进行建模,我们构建了基于IS-Conv的全局结构表示路径(GSRP),该路径学习了一个统一的边缘区域表示,且复杂度可承受。为了实现高效、轻量级的多层次特征融合,我们提出了分解立体注意模块(factorized stereoscopic attention, FSA),该模块通过在通道和空间注意之前引入分层注意来缓解语义混淆和减少特征冗余。结合以上模块,我们提出了一个简洁的语义分割框架FDLNet。实验证明了该方法的有效性和优越性。FDLNet在cityscape上实现了最先进的性能,在150+ FPS下mIoU为76.32%,在41+ FPS下mIoU为79.0%。代码可在https://github.com/qyan0131/FDLNet上获得。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信