Urban street scene analysis using lightweight multi-level multi-path feature aggregation network

Impact factor: 0.4; JCR: Q4; Category: COMPUTER SCIENCE, THEORY & METHODS

Multiagent and Grid Systems; Pub Date: 2021-12-20; DOI: 10.3233/mgs-210353

Tanmay Singha, Duc-Son Pham, A. Krishna

Pages: 249-271; Publication type: Journal Article

Citations: 2

Abstract


Urban street scene analysis using lightweight multi-level multi-path feature aggregation network

Urban street scene analysis is an important problem in computer vision, and many off-line models achieve outstanding semantic segmentation results. However, developing and optimizing deep neural architectures that run in real time with low computing requirements while maintaining good performance remains an ongoing challenge for the research community. Balancing model complexity against performance has been a major hurdle: many models lose too much accuracy for a slight reduction in model size and cannot handle high-resolution input images. This study addresses the issue with a novel model, named M2FANet, that provides a much better balance between efficiency and accuracy for scene segmentation than other alternatives. The proposed optimised backbone increases the model's efficiency, whereas the proposed Multi-level Multi-path (M2) feature aggregation approach enhances the model's performance in real-time environments. By exploiting a multi-feature scaling technique, M2FANet produces state-of-the-art results in resource-constrained situations while handling full-resolution input. On the Cityscapes benchmark data set, the proposed model achieves 68.5% and 68.3% class accuracy on the validation and test sets respectively, with only 1.3 million parameters. Compared with all real-time models of fewer than 5 million parameters, the proposed model is the most competitive in both performance and real-time capability.
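The abstract reports a class-level score on Cityscapes without defining it. As a rough illustration only, the sketch below assumes the figure is a class-averaged intersection-over-union (IoU), which is the per-class metric the Cityscapes benchmark reports; the abstract itself does not confirm this. The function names, the toy labels, and the ignore value of 255 are assumptions made for the example and are not taken from the paper.

```python
def confusion_matrix(truth, pred, num_classes, ignore_label=255):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        if t == ignore_label:  # assumed "void" label, skipped in scoring
            continue
        matrix[t][p] += 1
    return matrix


def mean_class_iou(matrix):
    """Average TP / (TP + FP + FN) over the classes that occur at all."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # class c pixels missed
        fp = sum(matrix[r][c] for r in range(n)) - tp   # pixels wrongly called c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy flattened label maps (hypothetical data, 3 classes, 255 = ignored pixel).
truth = [0, 0, 1, 1, 2, 2, 255, 1]
pred = [0, 1, 1, 1, 2, 0, 2, 1]
matrix = confusion_matrix(truth, pred, num_classes=3)
score = mean_class_iou(matrix)
print(f"mean class IoU: {score:.3f}")
```

Averaging per class rather than per pixel keeps frequent classes such as road from dominating the score, which is why a class-averaged figure is usually lower than plain pixel accuracy.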


Source journal

Multiagent and Grid Systems (COMPUTER SCIENCE, THEORY & METHODS)

CiteScore

1.50

Self-citation rate

0.00%

Number of publications