Reinforcement-Learning-Based Counter Deception for Nonlinear Pursuit–Evasion Game With Incomplete and Asymmetric Information
Authors: Yongkang Wang, Rongxin Cui, Weisheng Yan, Xinxin Guo, Shouxu Zhang, Zhuo Zhang, Zhexuan Zhao
Journal: IEEE Transactions on Systems Man Cybernetics-Systems, vol. 55, no. 5, pp. 3261-3274
Publication date: 2025-02-26
DOI: 10.1109/TSMC.2025.3541105
URL: https://ieeexplore.ieee.org/document/10904288/
Impact factor: 8.6; JCR: Q1 (Automation & Control Systems); Category: Computer Science (region 1)
Open access: no
Citations: 0
Abstract
In this article, we investigate the problem of capturing a noncooperative, deceptive target using reinforcement learning (RL) under incomplete information. The pursuer must cope not only with its own maneuverability constraint but also with the target's deception, in which the target deliberately conceals its private preference information. The target capture game with deception is formulated as a nonlinear differential game whose information structure is incomplete and asymmetric. We solve this differential game with an RL policy that incorporates critic, actor, and virtual actor neural networks (NNs) and accounts for the pursuer's maneuverability constraint and information structure. Moreover, the states of the constrained adversarial system and the weight errors are proven to be uniformly ultimately bounded (UUB). To counter the target's deception, we adopt an unscented Kalman filter (UKF) to estimate the target's intention regarding its energy preference and integrate the estimate into the pursuer's strategy. The feasibility and superiority of the proposed strategy are verified through comparisons with recent works.
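The abstract does not give the filter equations, so the following is only a minimal sketch of how a UKF could recover a hidden energy preference, not the authors' formulation. It assumes a hypothetical scalar measurement model in which the observed control effort of the target is `g_k / r`, with `r > 0` the concealed energy weight and `g_k` a term the pursuer is assumed to know. The filter state is `log r`, so that sigma points can never produce a negative weight. All names (`ukf_scalar_step`, `estimate_energy_weight`) and all numerical values are illustrative assumptions.

```python
import math
import random


def ukf_scalar_step(mean, var, y, h, q, r_noise, kappa=2.0):
    """One predict/update cycle of a scalar unscented Kalman filter.

    Assumed models (illustrative, not taken from the paper):
      state:       s_{k+1} = s_k + w,   w ~ N(0, q)   (random walk)
      measurement: y_k     = h(s_k) + v, v ~ N(0, r_noise)
    """
    n = 1  # state dimension
    # Predict: a random walk keeps the mean and inflates the variance.
    var_pred = var + q
    # Three sigma points of the basic unscented transform.
    spread = math.sqrt((n + kappa) * var_pred)
    sigmas = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    # Propagate the sigma points through the nonlinear measurement map.
    zs = [h(s) for s in sigmas]
    z_mean = sum(w * z for w, z in zip(weights, zs))
    innov_var = sum(w * (z - z_mean) ** 2 for w, z in zip(weights, zs)) + r_noise
    cross = sum(w * (s - mean) * (z - z_mean)
                for w, s, z in zip(weights, sigmas, zs))
    # Update with the Kalman gain built from the sigma-point statistics.
    gain = cross / innov_var
    new_mean = mean + gain * (y - z_mean)
    new_var = var_pred - gain * innov_var * gain
    return new_mean, new_var


def estimate_energy_weight(true_weight=4.0, steps=200, seed=0):
    """Simulate noisy control observations and return the estimated weight."""
    rng = random.Random(seed)
    noise_std = 0.02           # assumed measurement noise level
    mean, var = 0.0, 1.0       # prior on log r (i.e. initial guess r = 1)
    for k in range(steps):
        g = 1.0 + 0.5 * math.sin(0.1 * k)  # hypothetical known signal g_k
        y = g / true_weight + rng.gauss(0.0, noise_std)
        mean, var = ukf_scalar_step(
            mean, var, y,
            h=lambda s, g=g: g * math.exp(-s),  # y = g / r with r = exp(s)
            q=1e-6, r_noise=noise_std ** 2,
        )
    return math.exp(mean)


if __name__ == "__main__":
    print(f"estimated energy weight: {estimate_energy_weight():.3f}")
```

In a pursuit strategy of the kind the abstract describes, an estimate like this would be fed back into the pursuer's policy at each step. How the paper couples the filter to the critic, actor, and virtual actor networks is not stated here and is not reproduced in this sketch.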
About the journal:
The IEEE Transactions on Systems, Man, and Cybernetics: Systems encompasses the fields of systems engineering, covering issue formulation, analysis, and modeling throughout the systems engineering lifecycle phases. It addresses decision-making, issue interpretation, systems management, processes, and various methods such as optimization, modeling, and simulation in the development and deployment of large systems.