Multi-player pursuit-evasion differential game with equal speed

2017 International Automatic Control Conference (CACS). Pub Date: 2017-11-01. DOI: 10.1109/CACS.2017.8284276

Ahmad A. Al-Talabi

Citations: 4

Abstract

This paper proposes a specific form of reward function for the fuzzy actor-critic learning automaton (FACLA) algorithm, used to teach a team of pursuers how to capture a single evader. All the pursuers and the evader are assumed to have the same speed. With the proposed reward function, the FACLA algorithm can be used in a decentralized manner: each pursuer learns to take the right actions by tuning the parameters of its own fuzzy logic controller (FLC) with FACLA. The proposed reward function enables each pursuer to update its value function accurately. It depends on two factors that teach each pursuer how to take part in capturing the evader. The first factor is the difference in the line-of-sight (LOS) between each pursuer in the game and the evader at two consecutive time instants. The second factor is the difference between two consecutive Euclidean distances between each pursuer in the game and the evader. Simulation results are given to validate FACLA with the proposed reward function.
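The abstract names the two factors of the reward but not its exact formula, so the following is only a minimal sketch of how such a two-factor reward might be computed for one pursuer. Everything beyond the two factors is an assumption made for illustration: the function names, treating the LOS as an angle, the weights `w_los` and `w_dist`, and the sign convention (reward a shrinking distance, penalize LOS rotation). It is not the paper's formulation.

```python
import math

def los_angle(pursuer, evader):
    """LOS angle from pursuer to evader in radians (assumed definition)."""
    return math.atan2(evader[1] - pursuer[1], evader[0] - pursuer[0])

def wrap_angle(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi

def two_factor_reward(p_prev, e_prev, p_next, e_next, w_los=0.5, w_dist=0.5):
    """Hypothetical reward for one pursuer over one time step.

    p_prev, e_prev: (x, y) of pursuer and evader at time t
    p_next, e_next: (x, y) of pursuer and evader at time t+1
    w_los, w_dist:  illustrative weights, not taken from the paper
    """
    # Factor 1: change in LOS between two consecutive time instants.
    d_los = abs(wrap_angle(los_angle(p_next, e_next) - los_angle(p_prev, e_prev)))
    # Factor 2: change in Euclidean distance between two consecutive instants.
    # Positive when the pursuer has closed in on the evader.
    d_dist = math.dist(p_prev, e_prev) - math.dist(p_next, e_next)
    # Assumed combination: reward closing distance, penalize LOS rotation.
    return w_dist * d_dist - w_los * d_los
```

In a decentralized setup, each pursuer would evaluate such a function from its own position and the evader's position only, for example `two_factor_reward((0, 0), (10, 0), (1, 0), (10, 0))` for a pursuer that moves one unit straight toward a stationary evader.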
