MDRNet: a lightweight network for real-time semantic segmentation in street scenes

IF 1.7 | Region 4 (Computer Science) | JCR Q3 (Automation & Control Systems)

Assembly Automation | Pub Date: 2021-10-25 | DOI: 10.1108/aa-06-2021-0078

Yingpeng Dai, Junzheng Wang, Jiehao Li, Jing Li


Citations: 16

Abstract



Purpose

This paper addresses environmental perception for unmanned platforms in complex street scenes. Such platforms place strict requirements on both accuracy and inference speed, so the challenge is how to trade off the two when extracting environmental information.

Design/methodology/approach

The paper proposes a novel multi-scale depth-wise residual (MDR) module. The module combines depth-wise separable convolution, dilated convolution and 1-dimensional (1-D) convolution, so that it extracts local and contextual information jointly while remaining small and shallow. Building on the MDR module, a network named multi-scale depth-wise residual network (MDRNet) is designed for fast semantic segmentation. The network extracts multi-scale information and maintains feature maps with high spatial resolution in order to handle objects that appear at multiple scales.

Findings

Experiments on the CamVid and Cityscapes data sets show that the proposed MDRNet produces competitive results in both inference time and accuracy. Specifically, the authors obtained 67.47% and 68.7% mean Intersection over Union (MIoU) on CamVid and Cityscapes, respectively, with only 0.84 million parameters and faster inference on a single GTX 1070Ti card.

Originality/value

This research can provide a theoretical and engineering basis for environmental perception on unmanned platforms. In addition, it provides environmental information to support subsequent work.
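To give a rough sense of why depth-wise separable, 1-D and dilated convolutions keep a module small, the sketch below counts weights for each variant using the standard formulas. It is an illustration only: the function names and the channel and kernel sizes are assumptions chosen for the example, and nothing here reproduces the actual layout of the MDR module, which the abstract does not specify.

```python
# Illustrative weight counts for the convolution types named in the abstract.
# All sizes below are hypothetical; biases are omitted throughout.

def standard_conv_params(c_in, c_out, k):
    """Weights in a dense k x k convolution."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """k x k depth-wise conv (one filter per channel) plus a 1 x 1 point-wise conv."""
    return k * k * c_in + c_in * c_out


def factorized_depthwise_params(c_in, c_out, k):
    """Depth-wise k x 1 and 1 x k (1-D) convs, followed by a 1 x 1 point-wise conv."""
    return 2 * k * c_in + c_in * c_out


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k-tap kernel at the given dilation rate.

    Dilation widens the receptive field without adding any weights.
    """
    return k + (k - 1) * (dilation - 1)


c_in, c_out, k = 64, 64, 3  # assumed sizes, not taken from the paper
print("standard 3x3:           ", standard_conv_params(c_in, c_out, k))
print("depth-wise separable:   ", depthwise_separable_params(c_in, c_out, k))
print("1-D factorized variant: ", factorized_depthwise_params(c_in, c_out, k))
for d in (1, 2, 4):
    print(f"dilation {d}: covers {effective_kernel_size(k, d)} pixels per axis")
```

With these assumed sizes the dense layer needs 36,864 weights, the depth-wise separable version 4,672 and the 1-D factorized version 4,480, while a dilation rate of 4 stretches a 3-tap kernel over 9 pixels at no extra cost. This is the general mechanism by which such layers trade a small loss in expressiveness for a much smaller model.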


Source journal

Assembly Automation (Engineering & Technology: Engineering, Manufacturing)

CiteScore

4.30

Self-citation rate

14.30%

Articles published

Review time

3.3 months

About the journal: Assembly Automation publishes peer-reviewed research articles, technology reviews and specially commissioned case studies. Each issue includes high-quality content covering all aspects of assembly technology and automation, and reflecting the most interesting and strategically important research and development activities from around the world. Because of this, readers can stay at the very forefront of industry developments. All research articles undergo rigorous double-blind peer review, and the journal’s policy of not publishing work that has only been tested in simulation means that only the very best and most practical research articles are included. This ensures that the material that is published has real relevance and value for commercial manufacturing and research organizations.