Learning to Optimize State Estimation in Multi-Agent Reinforcement Learning-Based Collaborative Detection

IF 7.7 · Zone 2, Computer Science · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Mobile Computing Pub Date : 2024-08-19 DOI:10.1109/TMC.2024.3445583

Tianlong Zhou;Tianyi Shi;Hongye Gao;Weixiong Rao

IEEE Transactions on Mobile Computing, vol. 23, no. 12, pp. 14330-14343. Journal article, not open access. https://ieeexplore.ieee.org/document/10638830/

Citations: 0

Abstract

In this paper, we study the collaborative detection problem in a multi-agent environment. Using onboard range-bearing sensors, mobile agents make sequential control decisions, such as choosing a moving direction, to gather information about movable targets. To estimate target states, i.e., target location and velocity, classic methods such as the Kalman Filter (KF) and Extended Kalman Filter (EKF) impractically assume that the underlying state space model is fully known. Some recent learning-based works, e.g., KalmanNet, estimate target states alone, without estimation uncertainty, and therefore cannot support robust control decisions. To tackle these issues, we first propose a neural network-based state estimator, namely Two-phase Kalman Filter with Uncertainty Quantification (WALNUT), which explicitly gives both target states and estimation uncertainty. The developed multi-agent reinforcement learning (MARL) model then takes the learned target states and uncertainty as input and takes robust actions to track movable targets. Our extensive experiments demonstrate that our work outperforms the state of the art, with higher tracking ability and lower localization error.
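For readers unfamiliar with the classic baseline the abstract mentions, the following is a minimal sketch of one EKF predict/update cycle for a range-bearing sensor. It is not WALNUT and is not taken from the paper. The constant-velocity motion model, the isotropic process noise `q`, and the function names `predict`, `update`, and `wrap_angle` are all assumptions made for illustration. The point it shows is that the filter returns both a state estimate and a covariance matrix `P`, and that it needs the motion and measurement models to be known in advance.

```python
import numpy as np


def predict(x, P, dt, q):
    """Propagate state x = [px, py, vx, vy] with an assumed constant-velocity model."""
    F = np.eye(4)
    F[0, 2] = dt
    F[1, 3] = dt
    Q = q * np.eye(4)  # simplistic isotropic process noise (an assumption)
    return F @ x, F @ P @ F.T + Q


def wrap_angle(a):
    """Map an angle to the interval [-pi, pi)."""
    return (a + np.pi) % (2.0 * np.pi) - np.pi


def update(x, P, z, agent_xy, R):
    """EKF update with measurement z = [range, bearing] taken from agent_xy."""
    dx = x[0] - agent_xy[0]
    dy = x[1] - agent_xy[1]
    r2 = dx * dx + dy * dy
    r = np.sqrt(r2)

    # Predicted measurement h(x) and its Jacobian H with respect to the state.
    h = np.array([r, np.arctan2(dy, dx)])
    H = np.array([
        [dx / r, dy / r, 0.0, 0.0],
        [-dy / r2, dx / r2, 0.0, 0.0],
    ])

    # Innovation; the bearing residual is wrapped to avoid 2*pi jumps.
    y = z - h
    y[1] = wrap_angle(y[1])

    S = H @ P @ H.T + R             # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x + K @ y
    P_new = (np.eye(4) - K @ H) @ P  # posterior covariance (the uncertainty)
    return x_new, P_new
```

A caller would alternate `predict` and `update` at each time step and could pass both `x_new` and `P_new` to a downstream controller. The abstract's criticism of this baseline is that `F`, `Q`, and `R` must be specified by hand, which is the assumption a learned estimator tries to relax.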


Source journal

IEEE Transactions on Mobile Computing (Engineering & Technology: Telecommunications)

CiteScore: 12.90
Self-citation rate: 2.50%
Articles published: 403
Review time: 6.6 months

About the journal: IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing. This includes (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering aspects like mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.