Train Offline, Refine Online: Improving Cognitive Tracking Radar Performance With Approximate Policy Iteration and Deep Neural Networks

IEEE Transactions on Radar Systems, vol. 3, pp. 57-70. Pub Date: 2024-12-17. DOI: 10.1109/TRS.2024.3518954

Brian W. Rybicki; Jill K. Nelson


Abstract

A cognitive tracking radar continuously acquires, stores, and exploits knowledge from its target environment to improve kinematic tracking performance. In this work, we apply API-DNN, a reinforcement learning (RL) technique based on approximate policy iteration (API) with a deep neural network (DNN) policy, to cognitive radar tracking. API-DNN iteratively improves upon an initial base policy through repeated application of rollout and supervised learning. This approach can balance online against offline computation to improve efficiency, and it can adapt to changes in problem specification through online replanning. Prior state-of-the-art cognitive radar tracking approaches rely either on sophisticated search procedures with heuristics and carefully selected hyperparameters or on deep RL (DRL) agents built on exotic DNN architectures with poorly understood performance guarantees. API-DNN, instead, is based on the well-known principles of rollout, Monte Carlo simulation, and basic DNN function approximation. We demonstrate the effectiveness of API-DNN in cognitive radar simulations based on a standard maneuvering target tracking benchmark scenario. We also show how API-DNN can implement online replanning with updated target information.
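
The loop described in the abstract (rollout to improve on a base policy, then supervised learning to fit a new policy, repeated) can be illustrated with a minimal sketch. This is not the authors' implementation: the one-dimensional toy environment, the cost function, every constant, and all function names are hypothetical, and a majority-vote lookup table stands in for the DNN policy so that the example needs only the Python standard library.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy problem (not from the paper): a noisy 1-D chain in which
# the per-step cost is the distance to a target state.
N_STATES = 8        # states 0..7, target at state 7
ACTIONS = (-1, 1)   # move left or right
SLIP = 0.1          # probability that the chosen move is reversed
HORIZON = 20        # number of simulated steps per rollout
N_ROLLOUTS = 200    # Monte Carlo runs per (state, action) pair


def step(state, action, rng):
    """Simulate one noisy transition; reward is minus the distance to target."""
    if rng.random() < SLIP:
        action = -action
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, -float(N_STATES - 1 - nxt)


def rollout_value(state, action, policy, rng):
    """Monte Carlo estimate of taking `action` once, then following `policy`."""
    total = 0.0
    for _ in range(N_ROLLOUTS):
        s, ret = step(state, action, rng)
        for _ in range(HORIZON - 1):
            s, r = step(s, policy(s), rng)
            ret += r
        total += ret
    return total / N_ROLLOUTS


def fit_policy(samples):
    """Supervised step: a majority-vote table standing in for the DNN."""
    votes = defaultdict(Counter)
    for s, a in samples:
        votes[s][a] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}
    return lambda s: table.get(s, ACTIONS[0])


def approximate_policy_iteration(base_policy, n_iters=3, seed=0):
    """Alternate rollout-based improvement with supervised policy fitting."""
    rng = random.Random(seed)
    policy = base_policy
    for _ in range(n_iters):
        samples = []
        for s in range(N_STATES):
            # Rollout: pick the action whose simulated return is highest.
            best = max(ACTIONS, key=lambda a: rollout_value(s, a, policy, rng))
            samples.append((s, best))
        # Supervised learning: fit the next policy to the rollout decisions.
        policy = fit_policy(samples)
    return policy


# Start from a deliberately poor base policy that always moves left.
learned = approximate_policy_iteration(base_policy=lambda s: -1)
print([learned(s) for s in range(N_STATES)])
```

In this toy setting the fitted policy should come to prefer moving toward the target in every state. In the paper's setting, the supervised step would instead train a DNN on rollout decisions; the table is used here only to keep the sketch short and dependency-free.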
