Resource Allocation for Multi-Target Radar Tracking via Constrained Deep Reinforcement Learning

Impact Factor 7.4 | Region 1 (Computer Science) | JCR Q1 (TELECOMMUNICATIONS)
Ziyang Lu; M. Cenk Gursoy
{"title":"通过受限深度强化学习实现多目标雷达跟踪的资源分配","authors":"Ziyang Lu;M. Cenk Gursoy","doi":"10.1109/TCCN.2023.3304634","DOIUrl":null,"url":null,"abstract":"In this paper, multi-target tracking in a radar system is considered, and adaptive radar resource management is addressed. In particular, time management in tracking multiple maneuvering targets subject to budget constraints is studied with the goal to minimize the total tracking cost of all targets (or equivalently to maximize the tracking accuracies). The constrained optimization of the dwell time allocation to each target is addressed via deep Q-network (DQN) based reinforcement learning. In the proposed constrained deep reinforcement learning (CDRL) algorithm, both the parameters of the DQN and the dual variable are learned simultaneously. The proposed CDRL framework consists of two components, namely online CDRL and offline CDRL. Training a DQN in the deep reinforcement learning algorithm usually requires a large amount of data, which may not be available in a target tracking task due to the scarcity of measurements. We address this challenge by proposing an offline CDRL framework, in which the algorithm evolves in a virtual environment generated based on the current observations and prior knowledge of the environment. Simulation results show that both offline CDRL and online CDRL are critical for effective target tracking and resource utilization. Offline CDRL provides more training data to stabilize the learning process and the online component can sense the change in the environment and make the corresponding adaptation. Furthermore, a hybrid CDRL algorithm that combines offline CDRL and online CDRL is proposed to reduce the computational burden by performing offline CDRL only periodically to stabilize the training process of the online CDRL.","PeriodicalId":13069,"journal":{"name":"IEEE Transactions on Cognitive Communications and Networking","volume":"9 6","pages":"1677-1690"},"PeriodicalIF":7.4000,"publicationDate":"2023-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Resource Allocation for Multi-Target Radar Tracking via Constrained Deep Reinforcement Learning\",\"authors\":\"Ziyang Lu;M. Cenk Gursoy\",\"doi\":\"10.1109/TCCN.2023.3304634\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, multi-target tracking in a radar system is considered, and adaptive radar resource management is addressed. In particular, time management in tracking multiple maneuvering targets subject to budget constraints is studied with the goal to minimize the total tracking cost of all targets (or equivalently to maximize the tracking accuracies). The constrained optimization of the dwell time allocation to each target is addressed via deep Q-network (DQN) based reinforcement learning. In the proposed constrained deep reinforcement learning (CDRL) algorithm, both the parameters of the DQN and the dual variable are learned simultaneously. The proposed CDRL framework consists of two components, namely online CDRL and offline CDRL. Training a DQN in the deep reinforcement learning algorithm usually requires a large amount of data, which may not be available in a target tracking task due to the scarcity of measurements. We address this challenge by proposing an offline CDRL framework, in which the algorithm evolves in a virtual environment generated based on the current observations and prior knowledge of the environment. 
Simulation results show that both offline CDRL and online CDRL are critical for effective target tracking and resource utilization. Offline CDRL provides more training data to stabilize the learning process and the online component can sense the change in the environment and make the corresponding adaptation. Furthermore, a hybrid CDRL algorithm that combines offline CDRL and online CDRL is proposed to reduce the computational burden by performing offline CDRL only periodically to stabilize the training process of the online CDRL.\",\"PeriodicalId\":13069,\"journal\":{\"name\":\"IEEE Transactions on Cognitive Communications and Networking\",\"volume\":\"9 6\",\"pages\":\"1677-1690\"},\"PeriodicalIF\":7.4000,\"publicationDate\":\"2023-08-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Cognitive Communications and Networking\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10215369/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"TELECOMMUNICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Cognitive Communications and Networking","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10215369/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"TELECOMMUNICATIONS","Score":null,"Total":0}
Citations: 0

Abstract

In this paper, multi-target tracking in a radar system is considered, and adaptive radar resource management is addressed. In particular, time management in tracking multiple maneuvering targets subject to budget constraints is studied, with the goal of minimizing the total tracking cost over all targets (or, equivalently, maximizing the tracking accuracy). The constrained optimization of the dwell-time allocation to each target is addressed via deep Q-network (DQN) based reinforcement learning. In the proposed constrained deep reinforcement learning (CDRL) algorithm, the parameters of the DQN and the dual variable are learned simultaneously. The proposed CDRL framework consists of two components, namely online CDRL and offline CDRL. Training a DQN in a deep reinforcement learning algorithm usually requires a large amount of data, which may not be available in a target tracking task due to the scarcity of measurements. We address this challenge by proposing an offline CDRL framework, in which the algorithm evolves in a virtual environment generated from the current observations and prior knowledge of the environment. Simulation results show that both offline CDRL and online CDRL are critical for effective target tracking and resource utilization: offline CDRL provides more training data to stabilize the learning process, while the online component senses changes in the environment and adapts accordingly. Furthermore, a hybrid CDRL algorithm that combines offline CDRL and online CDRL is proposed to reduce the computational burden by performing offline CDRL only periodically to stabilize the training process of the online CDRL.
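The abstract states that the DQN parameters and a dual variable are learned simultaneously so that the time-budget constraint is respected. The sketch below illustrates one way such a primal-dual update could look, assuming a Lagrangian relaxation in which a per-step resource cost is weighted by the dual variable; the class and function names (DwellQNet, cdrl_update), the network architecture, and the update schedule are illustrative assumptions and are not taken from the paper.

```python
# Minimal sketch (not the authors' implementation) of a primal-dual DQN update
# for constrained dwell-time allocation. The Q-network is trained on the
# Lagrangian reward r - lam * c, and the dual variable lam is updated by
# projected gradient ascent on the budget constraint. All names and the
# architecture below are illustrative assumptions.

import torch
import torch.nn as nn


class DwellQNet(nn.Module):
    """Hypothetical Q-network: track state -> Q-values over discrete dwell-time actions."""

    def __init__(self, state_dim: int, num_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, num_actions),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s)


def cdrl_update(qnet, target_qnet, optimizer, batch, lam, dual_lr, budget, gamma=0.99):
    """One primal-dual step: DQN regression on the Lagrangian target, then dual ascent.

    batch = (states, actions, rewards, costs, next_states), where `rewards` is the
    negative tracking cost, `costs` is the per-step time/resource usage, and
    `actions` is a LongTensor of dwell-time indices.
    """
    s, a, r, c, s_next = batch

    # Fold the constraint into the return via the Lagrangian reward.
    r_lag = r - lam * c

    # Standard DQN target with a frozen target network.
    with torch.no_grad():
        target = r_lag + gamma * target_qnet(s_next).max(dim=1).values
    q = qnet(s).gather(1, a.unsqueeze(1)).squeeze(1)

    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Dual ascent: raise lam when the average cost exceeds the budget,
    # lower it otherwise, and project onto lam >= 0.
    lam = max(0.0, lam + dual_lr * (c.mean().item() - budget))
    return lam, loss.item()
```

In this sketch the constraint is enforced by projected dual ascent on the gap between the average per-step cost and the budget; the paper's exact constraint formulation, state/action definitions, and offline/online training loop may differ.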
Source journal
IEEE Transactions on Cognitive Communications and Networking
Category: Computer Science - Artificial Intelligence
CiteScore: 15.50
Self-citation rate: 7.00%
Articles published: 108
Journal description: The IEEE Transactions on Cognitive Communications and Networking (TCCN) aims to publish high-quality manuscripts that push the boundaries of cognitive communications and networking research. Cognitive, in this context, refers to the application of perception, learning, reasoning, memory, and adaptive approaches in communication system design. The transactions welcome submissions that explore various aspects of cognitive communications and networks, focusing on innovative and holistic approaches to complex system design. Key topics covered include architecture, protocols, cross-layer design, and cognition cycle design for cognitive networks. Additionally, research on machine learning, artificial intelligence, end-to-end and distributed intelligence, software-defined networking, cognitive radios, spectrum sharing, and security and privacy issues in cognitive networks is of interest. The publication also encourages papers addressing novel services and applications enabled by these cognitive concepts.