An Efficient Multi-View Cross-Attention Accelerator for Vision-Centric 3D Perception in Autonomous Driving

IF 5.2 · CAS Tier 1 (Engineering & Technology) · JCR Q1 · ENGINEERING, ELECTRICAL & ELECTRONIC
Dongxu Lyu;Zhenyu Li;Yansong Xu;Gang Wang;Wenjie Li;Yuzhou Chen;Liyan Chen;Weifeng He;Guanghui He
{"title":"An Efficient Multi-View Cross-Attention Accelerator for Vision-Centric 3D Perception in Autonomous Driving","authors":"Dongxu Lyu;Zhenyu Li;Yansong Xu;Gang Wang;Wenjie Li;Yuzhou Chen;Liyan Chen;Weifeng He;Guanghui He","doi":"10.1109/TCSI.2025.3555837","DOIUrl":null,"url":null,"abstract":"Vision-centric 3D perception has become a key mechanism in autonomous driving. It achieves exceptional perceptual performance mainly by introducing a novel attention, multi-view cross-attention (MVCA), for learnable feature extraction and fusion from surround-view cameras. Despite its superiority, MVCA encounters severe inefficiencies in sample, processing elements (PE), and pipelined processing, owing to the redundant and non-uniform sampling-aggregation and rigorous inter-operator dependencies. To address these issues, this article proposes a dedicated MVCA accelerator, MVAtor, with algorithm-architecture co-optimization for vision-centric 3D perception based on multi-view inputs flexibly. For sample inefficiency, a 3-tier hybrid static-dynamic sample and a sensitivity-aware feature pruning approach are proposed to eliminate the 86.03% sample overhead and 24.48% memory requirement, only incuring <1%> <tex-math>$53.7\\sim 96.1$ </tex-math></inline-formula>% energy-delay product reduction. For pipeline inefficiency, a fine-grained-tiling assisted highly-pipelined architecture is constructed in MVAtor by exploiting the decoupling opportunities on inter-view sparsity, thereby saving 61.03% external memory access while boosting the overall throughputs by <inline-formula> <tex-math>$1.83\\times $ </tex-math></inline-formula>. Extensively evaluated on representative benchmarks, MVAtor attains <inline-formula> <tex-math>$1.38\\sim 7.67\\times $ </tex-math></inline-formula> and <inline-formula> <tex-math>$1.67\\sim 11.15\\times $ </tex-math></inline-formula> improvement on energy and area efficiency respectively, compared to the state-of-the-art related accelerators.","PeriodicalId":13039,"journal":{"name":"IEEE Transactions on Circuits and Systems I: Regular Papers","volume":"72 7","pages":"3272-3285"},"PeriodicalIF":5.2000,"publicationDate":"2025-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Circuits and Systems I: Regular Papers","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10950425/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

Vision-centric 3D perception has become a key mechanism in autonomous driving. It achieves exceptional perceptual performance mainly by introducing a novel attention mechanism, multi-view cross-attention (MVCA), for learnable feature extraction and fusion from surround-view cameras. Despite its superiority, MVCA suffers severe sampling, processing-element (PE), and pipeline inefficiencies, owing to redundant and non-uniform sampling-aggregation and strict inter-operator dependencies. To address these issues, this article proposes MVAtor, a dedicated MVCA accelerator with algorithm-architecture co-optimization for vision-centric 3D perception that flexibly handles multi-view inputs. For sampling inefficiency, a 3-tier hybrid static-dynamic sampling scheme and a sensitivity-aware feature pruning approach are proposed, eliminating 86.03% of the sampling overhead and 24.48% of the memory requirement while incurring <1% accuracy loss. For PE inefficiency, MVAtor achieves a 53.7-96.1% energy-delay product reduction. For pipeline inefficiency, a fine-grained-tiling-assisted, highly pipelined architecture is constructed in MVAtor by exploiting decoupling opportunities in inter-view sparsity, saving 61.03% of external memory accesses while boosting overall throughput by 1.83x. Extensively evaluated on representative benchmarks, MVAtor attains 1.38-7.67x and 1.67-11.15x improvements in energy efficiency and area efficiency, respectively, over state-of-the-art related accelerators.
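To make the bottleneck concrete, below is a minimal PyTorch sketch of the sampling-aggregation pattern behind MVCA, in the deformable-attention / BEVFormer style the abstract alludes to: each query predicts a few sampling offsets and aggregation weights, gathers features only from the camera views its reference point projects into, and averages over the hit views. All names, shapes, and the precomputed ref_pts/hit_mask inputs are illustrative assumptions, not MVAtor's actual implementation; the per-query scattered reads and the per-view loop are exactly the non-uniform sampling and inter-view dependencies the accelerator targets.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiViewCrossAttention(nn.Module):
    """Illustrative MVCA sketch (deformable-attention style), not MVAtor's design."""

    def __init__(self, dim=256, n_points=4):
        super().__init__()
        self.n_points = n_points
        self.offset = nn.Linear(dim, n_points * 2)  # per-query sampling offsets (pixels)
        self.weight = nn.Linear(dim, n_points)      # per-query aggregation weights
        self.proj = nn.Linear(dim, dim)             # output projection

    def forward(self, query, feats, ref_pts, hit_mask):
        # query:    [B, Q, C]       BEV/object queries
        # feats:    [B, V, C, H, W] surround-view image features
        # ref_pts:  [B, V, Q, 2]    projected 2D reference points in [0, 1]
        # hit_mask: [B, V, Q]       True where the query projects into view v
        B, V, C, H, W = feats.shape
        Q, P = query.shape[1], self.n_points
        offsets = self.offset(query).view(B, 1, Q, P, 2)
        weights = self.weight(query).softmax(-1).view(B, 1, Q, P)
        # Normalize pixel offsets and map locations to grid_sample's [-1, 1] range.
        loc = (ref_pts.unsqueeze(3) + offsets / query.new_tensor([W, H])) * 2 - 1
        out = query.new_zeros(B, Q, C)
        hits = hit_mask.sum(1).clamp(min=1).unsqueeze(-1)  # views hit per query
        for v in range(V):  # the inter-view sparsity MVAtor's pipeline exploits
            # Bilinear gather of P points per query from this view: [B, C, Q, P].
            sampled = F.grid_sample(feats[:, v], loc[:, v], align_corners=False)
            agg = (sampled * weights[:, v].unsqueeze(1)).sum(-1)  # [B, C, Q]
            out = out + agg.transpose(1, 2) * hit_mask[:, v].unsqueeze(-1)
        return self.proj(out / hits)

# Toy shapes: 6 cameras, 16x44 feature maps, 200 queries.
mvca = MultiViewCrossAttention(dim=256)
q = torch.randn(1, 200, 256)
f = torch.randn(1, 6, 256, 16, 44)
r = torch.rand(1, 6, 200, 2)
m = torch.rand(1, 6, 200) < 0.3  # each query hits only ~2 of 6 views
print(mvca(q, f, r, m).shape)    # torch.Size([1, 200, 256])
```

Note how most (query, view) pairs in the toy example are misses: a hardware design that samples all of them uniformly wastes the bulk of its work, which is the sampling redundancy the paper's hybrid static-dynamic scheme prunes.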
Source Journal
IEEE Transactions on Circuits and Systems I: Regular Papers
CiteScore: 9.80
Self-citation rate: 11.80%
Annual articles: 441
Review time: 2 months
About the journal: TCAS I publishes regular papers in the field specified by the theory, analysis, design, and practical implementations of circuits, and the application of circuit techniques to systems and to signal processing. Included is the whole spectrum from basic scientific theory to industrial applications. The field of interest covered includes:
- Analog, digital and mixed-signal circuits and systems
- Nonlinear circuits and systems, integrated sensors, MEMS and systems on chip, nanoscale circuits and systems, optoelectronic circuits and systems
- Power electronics and systems
- Software for analog-and-logic circuits and systems
- Control aspects of circuits and systems