Urban street scene analysis using lightweight multi-level multi-path feature aggregation network

IF 0.6 Q4 COMPUTER SCIENCE, THEORY & METHODS
Tanmay Singha, Duc-Son Pham, A. Krishna
Journal: Multiagent and Grid Systems
Publication date: 2021-12-20
Publication type: Journal Article
DOI: 10.3233/mgs-210353 (https://doi.org/10.3233/mgs-210353)
Open access: no
Citation count: 2

Abstract

Urban street scene analysis is an important problem in computer vision, and many off-line models achieve outstanding semantic segmentation results. However, developing and optimising a deep neural architecture that runs in real time with low computing requirements whilst maintaining good performance remains an ongoing challenge for the research community. Balancing model complexity against performance has been a major hurdle: many models lose too much accuracy for a slight reduction in model size and cannot handle high-resolution input images. This study addresses the issue with a novel model, named M2FANet, that provides a much better balance between efficiency and accuracy for scene segmentation than other alternatives. The proposed optimised backbone increases the model's efficiency, whereas the proposed Multi-level Multi-path (M2) feature aggregation approach enhances its performance in a real-time environment. By exploiting a multi-feature scaling technique, M2FANet produces state-of-the-art results in resource-constrained situations whilst handling full input resolution. On the Cityscapes benchmark data set, the proposed model achieves 68.5% and 68.3% class accuracy on the validation and test sets, respectively, with only 1.3 million parameters. Compared with all real-time models with fewer than 5 million parameters, the proposed model is the most competitive in both performance and real-time capability.
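The abstract reports a class-level score on Cityscapes without defining it. As a rough illustration only, the sketch below computes a class-averaged intersection-over-union (IoU) from a confusion matrix, which is a common way to score semantic segmentation per class. It assumes that the reported "class accuracy" is a class-averaged metric of this kind; the function names, the three classes, and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_class_iou(matrix: List[List[int]]) -> float:
    """Average IoU over classes that appear in the labels or the predictions."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]                                # correctly labelled pixels
        fn = sum(matrix[c]) - tp                         # class c pixels that were missed
        fp = sum(matrix[r][c] for r in range(n)) - tp    # pixels wrongly assigned to c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example (assumption): six "pixels" and three hypothetical classes.
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(labels, preds, num_classes=3)
score = mean_class_iou(cm)
print(f"mean class IoU: {score:.3f}")
```

In this toy case the per-class IoU values are 1/3, 2/3 and 1/2, so the class average is 0.5. Averaging per class rather than per pixel keeps small classes from being swamped by large ones such as road or sky.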
Source journal: Multiagent and Grid Systems (COMPUTER SCIENCE, THEORY & METHODS)
CiteScore: 1.50
Self-citation rate: 0.00%
Number of articles published: 13