A Low-Rank CNN Architecture for Real-Time Semantic Segmentation in Visual SLAM Applications

Laura Falaschetti; Lorenzo Manoni; Claudio Turchetti
IEEE Open Journal of Circuits and Systems (IF 2.4, JCR Q2: Engineering, Electrical & Electronic)
DOI: 10.1109/OJCAS.2022.3174632
Published: 2022-03-12
https://ieeexplore.ieee.org/document/9773325/
Citations: 3

Abstract

Real-time semantic segmentation on embedded devices has recently gained significant popularity, driven by growing interest in smart vehicles and smart robots. In particular, with the emergence of autonomous driving, the need for low latency combined with computation-intensive operations poses new challenges for vehicles and robots, such as excessive demands on computing power and energy. This paper addresses semantic segmentation, one of the most critical tasks in perceiving the environment, and its implementation on a low-power core while preserving the required accuracy and keeping complexity low. To this end, a low-rank convolutional neural network (CNN) architecture for real-time semantic segmentation is proposed. The main contributions of this paper are: i) a tensor decomposition technique is applied to the kernel of a generic convolutional layer; ii) three versions of an optimized architecture that combines the UNet and ResNet models are derived to explore the trade-off between model complexity and accuracy; iii) the low-rank CNN architectures are implemented on the Raspberry Pi 4 and NVIDIA Jetson Nano 2 GB embedded platforms, which serve as demanding benchmarks for the low-power, low-cost requirements, and on the high-cost NVIDIA Tesla P100 PCIe 16 GB GPU to obtain the best performance.
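The abstract does not say which tensor decomposition is used, so the following is only an illustrative sketch of why a low-rank kernel reduces complexity. It assumes one common scheme, not confirmed by the source: a k x k convolution is replaced by a k x k convolution with `rank` output channels followed by a 1 x 1 convolution. The function names and the layer sizes are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_out * c_in * k * k


def low_rank_conv_params(c_in, c_out, k, rank):
    """Weights after an assumed rank-`rank` split of the layer:
    a k x k convolution from c_in to `rank` channels, followed by
    a 1 x 1 convolution from `rank` to c_out channels."""
    return rank * c_in * k * k + c_out * rank


# Hypothetical layer: 32 input channels, 64 output channels, 3 x 3 kernel.
full = conv_params(32, 64, 3)
reduced = low_rank_conv_params(32, 64, 3, rank=8)
print(full, reduced, round(full / reduced, 2))
```

Under this assumed scheme the split only saves weights while `rank` stays below `c_out * c_in * k * k / (c_in * k * k + c_out)`, which is about 52 for the layer above; choosing the rank is one way to trade model complexity against accuracy.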