Multi-player pursuit-evasion differential game with equal speed

Ahmad A. Al-Talabi
Published in: 2017 International Automatic Control Conference (CACS), 2017-11-01. DOI: 10.1109/CACS.2017.8284276 (https://doi.org/10.1109/CACS.2017.8284276)
Citations: 4

Abstract

This paper proposes a particular form of reward function for the fuzzy actor-critic learning automaton (FACLA) algorithm to teach a team of pursuers how to capture a single evader. All the pursuers and the evader are assumed to have the same speed. The FACLA algorithm with the proposed reward function can be used in a decentralized manner: each pursuer learns to take the right actions by tuning the parameters of its fuzzy logic controller (FLC) with the FACLA algorithm. The proposed reward function enables each pursuer to update its value function accurately. It depends on two factors that teach each pursuer how to participate in capturing the evader. The first factor depends on the difference in the line-of-sight (LOS) between each pursuer in the game and the evader at two consecutive time instants. The second factor depends on the difference between two consecutive Euclidean distances between each pursuer in the game and the evader. Simulation results are given to validate the FACLA algorithm with the proposed reward function.
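
The abstract names the two quantities behind the reward but not the formula that combines them. The following is a minimal Python sketch of one way such a per-pursuer reward could be computed, not the paper's actual formulation. The function names, the weights `w_dist` and `w_los`, the weighted-sum form, and the sign convention (closing distance is rewarded, rotation of the LOS is penalized) are all assumptions made for illustration.

```python
import math


def los_angle(pursuer, evader):
    """Line-of-sight angle from pursuer to evader, in radians."""
    return math.atan2(evader[1] - pursuer[1], evader[0] - pursuer[0])


def wrap_angle(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def reward(p_prev, e_prev, p_curr, e_curr, w_dist=1.0, w_los=1.0):
    """Hypothetical reward for one pursuer over one time step.

    p_prev, e_prev: (x, y) positions of pursuer and evader at time t.
    p_curr, e_curr: (x, y) positions of pursuer and evader at time t+1.
    w_dist, w_los: assumed weights; the abstract does not specify any.
    """
    # Factor 2 in the abstract: difference between two consecutive
    # Euclidean distances (positive when the pursuer closes in).
    dist_term = math.dist(p_prev, e_prev) - math.dist(p_curr, e_curr)

    # Factor 1 in the abstract: difference in the LOS angle at two
    # consecutive time instants. Penalizing its magnitude is an assumption.
    los_change = wrap_angle(los_angle(p_curr, e_curr) - los_angle(p_prev, e_prev))
    los_term = -abs(los_change)

    return w_dist * dist_term + w_los * los_term
```

Because the reward uses only the pursuer's own position and the evader's position, each pursuer can evaluate it locally, which fits the decentralized use described in the abstract. For example, `reward((0, 0), (10, 0), (1, 0), (10, 0))` describes a pursuer that moves one unit straight toward a stationary evader: the distance term is 1 and the LOS does not change.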