TransRAD: Retentive Vision Transformer for Enhanced Radar Object Detection

Lei Cheng; Siyang Cao
{"title":"TransRAD: Retentive Vision Transformer for Enhanced Radar Object Detection","authors":"Lei Cheng;Siyang Cao","doi":"10.1109/TRS.2025.3537604","DOIUrl":null,"url":null,"abstract":"Despite significant advancements in environment perception capabilities for autonomous driving and intelligent robotics, cameras and LiDARs remain notoriously unreliable in low-light conditions and adverse weather, which limits their effectiveness. Radar serves as a reliable and low-cost sensor that can effectively complement these limitations. However, radar-based object detection has been underexplored due to the inherent weaknesses of radar data, such as low resolution, high noise, and lack of visual information. In this article, we present TransRAD, a novel 3-D radar object detection model designed to address these challenges by leveraging the retentive vision transformer (RMT) to more effectively learn features from information-dense radar range-Azimuth–Doppler (RAD) data. Our approach leverages the retentive Manhattan self-attention (MaSA) mechanism provided by RMT to incorporate explicit spatial priors, thereby enabling more accurate alignment with the spatial saliency characteristics of radar targets in RAD data and achieving precise 3-D radar detection across RAD dimensions. Furthermore, we propose location-aware nonmaximum suppression (LA-NMS) to effectively mitigate the common issue of duplicate bounding boxes in deep radar object detection. The experimental results demonstrate that TransRAD outperforms state-of-the-art (SOTA) methods in both 2-D and 3-D radar detection tasks, achieving higher accuracy, faster inference speed, and reduced computational complexity. Code is available at <uri>https://github.com/radar-lab/TransRAD</uri>.","PeriodicalId":100645,"journal":{"name":"IEEE Transactions on Radar Systems","volume":"3 ","pages":"303-317"},"PeriodicalIF":0.0000,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Radar Systems","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10869508/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Despite significant advancements in environment perception for autonomous driving and intelligent robotics, cameras and LiDARs remain notoriously unreliable in low-light conditions and adverse weather, which limits their effectiveness. Radar is a reliable, low-cost sensor that can effectively compensate for these limitations. However, radar-based object detection has been underexplored due to the inherent weaknesses of radar data, such as low resolution, high noise, and lack of visual information. In this article, we present TransRAD, a novel 3-D radar object detection model designed to address these challenges by leveraging the retentive vision transformer (RMT) to more effectively learn features from information-dense radar range-azimuth-Doppler (RAD) data. Our approach uses the retentive Manhattan self-attention (MaSA) mechanism provided by RMT to incorporate explicit spatial priors, enabling more accurate alignment with the spatial saliency characteristics of radar targets in RAD data and achieving precise 3-D radar detection across the RAD dimensions. Furthermore, we propose location-aware non-maximum suppression (LA-NMS) to effectively mitigate the common issue of duplicate bounding boxes in deep radar object detection. Experimental results demonstrate that TransRAD outperforms state-of-the-art (SOTA) methods in both 2-D and 3-D radar detection tasks, achieving higher accuracy, faster inference speed, and reduced computational complexity. Code is available at https://github.com/radar-lab/TransRAD.
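For intuition on how MaSA injects an explicit spatial prior, the sketch below illustrates the core idea from the RMT line of work: the softmax attention map over a flattened H x W feature grid is modulated elementwise by a decay term gamma raised to the Manhattan distance between token positions, so nearby cells contribute more than distant ones. This is a minimal, illustrative sketch and not the TransRAD implementation; the helper names (manhattan_decay_mask, masa_single_head), the single-head form, the gamma value, and the 1/sqrt(d) scaling are assumptions made for the example.

    import torch

    def manhattan_decay_mask(H, W, gamma=0.9, device=None):
        # Pairwise Manhattan distances between all (row, col) positions on the H x W grid.
        ys, xs = torch.meshgrid(torch.arange(H, device=device),
                                torch.arange(W, device=device), indexing="ij")
        coords = torch.stack([ys.flatten(), xs.flatten()], dim=-1).float()  # (H*W, 2)
        dist = torch.cdist(coords, coords, p=1)                             # (H*W, H*W) Manhattan distances
        return gamma ** dist                                                # closer token pairs get larger weights

    def masa_single_head(q, k, v, gamma=0.9):
        # Minimal single-head Manhattan self-attention over a flattened H x W feature map.
        # q, k, v: (B, H*W, d). The spatial prior is a per-pair decay applied to the attention map.
        B, N, d = q.shape
        H = W = int(N ** 0.5)                                               # assume a square grid in this sketch
        attn = torch.softmax(q @ k.transpose(-1, -2) / d ** 0.5, dim=-1)    # (B, N, N) standard attention
        decay = manhattan_decay_mask(H, W, gamma, q.device)                 # (N, N) explicit spatial prior
        attn = attn * decay                                                 # bias attention toward nearby cells
        return attn @ v                                                     # (B, N, d)

A toy call such as q = k = v = torch.randn(1, 16 * 16, 32) returns a tensor of the same shape, with attention weights biased toward spatially nearby cells, which is the sense in which the abstract describes aligning attention with the spatial saliency of radar targets.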