Representation Learning for Place Recognition Using MIMO Radar

IF 4.6 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Open Journal of Intelligent Transportation Systems. Pub Date: 2025-02-18. DOI: 10.1109/OJITS.2025.3543286

Prashant Kumar Rai; Nataliya Strokina; Reza Ghabcheloo

Volume 6, pages 144-154. Full text: https://ieeexplore.ieee.org/document/10891700/

Citations: 0

Abstract

Traditional radar perception often relies on point clouds derived from radar heatmaps using constant false alarm rate (CFAR) filtering, which can discard valuable information, especially the weaker signals crucial for accurate perception. To address this, we present a novel approach for representation learning directly from pre-CFAR heatmaps, specifically for place recognition using a high-resolution MIMO radar sensor. By avoiding CFAR filtering, our method preserves richer contextual data, capturing finer details essential for identifying and matching distinctive features across locations. Pre-CFAR heatmaps, however, contain inherent noise and clutter, complicating their application in radar perception tasks. To overcome this, we propose a self-supervised network that learns robust latent features from noisy heatmaps. The architecture consists of two identical U-Net encoders that extract features from a pair of radar scans; these features are then processed by a transformer encoder to estimate ego-motion. Ground-truth ego-motion trajectories guide the network training through a weighted mean-square error loss. The latent feature representations from the trained encoders are used to create a database of feature vectors for previously visited locations. At runtime, for place recognition and loop closure detection, cosine similarity is computed between the query scan's feature representation and the database entries to find the closest matches. We also introduce data augmentation techniques to handle limited training data, enhancing the model’s generalization capability. Our approach, tested on the publicly available ColoRadar dataset and our own dataset, outperforms existing methods, showing significant improvements in place recognition accuracy, particularly in noisy and cluttered environments.
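The retrieval step described in the abstract, matching a query scan's feature vector against a database of previously visited places by cosine similarity, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_database` and `query_place`, the `top_k` and `min_similarity` parameters, and the three-dimensional toy vectors are all assumptions made for illustration. In the paper the vectors would come from the trained U-Net encoders.

```python
import numpy as np


def build_database(features):
    """Stack feature vectors (one row per visited place) and L2-normalize
    each row, so that a dot product with a unit-length query equals the
    cosine similarity. Hypothetical helper, not from the paper."""
    features = np.asarray(features, dtype=np.float64)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.clip(norms, 1e-12, None)


def query_place(database, query, top_k=1, min_similarity=0.0):
    """Return up to top_k (index, similarity) pairs, best match first.

    Candidates whose similarity falls below min_similarity are dropped;
    the threshold is an assumed knob for rejecting weak loop-closure
    candidates, not a value taken from the paper."""
    query = np.asarray(query, dtype=np.float64)
    query = query / max(np.linalg.norm(query), 1e-12)
    sims = database @ query                 # cosine similarity per place
    order = np.argsort(-sims)[:top_k]       # indices sorted by descending similarity
    return [(int(i), float(sims[i])) for i in order if sims[i] >= min_similarity]


# Toy example: three "places" with made-up 3-D feature vectors.
db = build_database([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [1.0, 1.0, 0.0]])
matches = query_place(db, [0.9, 0.1, 0.0], top_k=2)
for index, similarity in matches:
    print(f"place {index}: cosine similarity {similarity:.3f}")
```

Normalizing the database once up front means each query costs a single matrix-vector product, which is a common design choice when the database grows with every visited location.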

