Research on game strategy of spacecraft chase and escape based on adaptive augmented random search

Jie Jiao, Yongjie Gou, Wenbo Wu, Binfeng Pan
Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University, published 2024-02-01. DOI: 10.1051/jnwpu/20244210117 (https://doi.org/10.1051/jnwpu/20244210117)

Abstract

To address the interception problem in the survival-type differential pursuit-evasion game between a spacecraft and a non-cooperative target, the pursuit-evasion policy is studied using reinforcement learning, and an adaptive augmented random search algorithm is proposed. First, to mitigate the sparse-reward problem in sequential decision making, an exploration method based on perturbations in the policy parameter space is designed, which accelerates convergence. Second, to avoid premature convergence to a local optimum, a novelty function is designed to guide the policy update, which improves data efficiency. Finally, the effectiveness and advantages of the proposed method are verified through numerical simulations and compared with those of the augmented random search algorithm, the proximal policy optimization algorithm, and the deep deterministic policy gradient algorithm.
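The abstract does not give the update rule, so the following is only a minimal sketch of the two ideas it names: exploration by perturbing policy parameters, and a novelty term that influences the update. It uses the basic augmented random search update with a k-nearest-neighbour novelty bonus as a stand-in; it is not the authors' adaptive algorithm. The toy return function `rollout`, the behaviour descriptor `behavior`, and all hyperparameters (`nu`, `alpha`, `w_novelty`, `k`) are assumptions made for illustration.

```python
import math
import random


def rollout(theta):
    """Toy stand-in for an episode return (assumption): peak at (1, -2)."""
    target = (1.0, -2.0)
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


def behavior(theta):
    """Hypothetical behaviour descriptor; here just the parameters themselves."""
    return tuple(theta)


def novelty(b, archive, k=3):
    """Mean Euclidean distance from b to its k nearest archive entries."""
    if not archive:
        return 0.0
    nearest = sorted(math.dist(b, a) for a in archive)[:k]
    return sum(nearest) / len(nearest)


def ars_step(theta, archive, rng, n_dirs=8, top_b=4,
             nu=0.1, alpha=0.05, w_novelty=0.1):
    """One random-search update with a novelty bonus added to each score."""
    candidates = []
    for _ in range(n_dirs):
        # Explore in parameter space: evaluate theta +/- nu * delta.
        delta = [rng.gauss(0.0, 1.0) for _ in theta]
        plus = [t + nu * d for t, d in zip(theta, delta)]
        minus = [t - nu * d for t, d in zip(theta, delta)]
        s_plus = rollout(plus) + w_novelty * novelty(behavior(plus), archive)
        s_minus = rollout(minus) + w_novelty * novelty(behavior(minus), archive)
        candidates.append((s_plus, s_minus, delta))

    # Keep the directions whose better side scored highest.
    candidates.sort(key=lambda c: max(c[0], c[1]), reverse=True)
    best = candidates[:top_b]

    # Normalise the step by the standard deviation of the retained scores.
    scores = [s for sp, sm, _ in best for s in (sp, sm)]
    mean = sum(scores) / len(scores)
    sigma = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores)) or 1.0

    new_theta = list(theta)
    for sp, sm, delta in best:
        for i, d in enumerate(delta):
            new_theta[i] += alpha / (len(best) * sigma) * (sp - sm) * d

    archive.append(behavior(new_theta))
    return new_theta


def run(steps, seed=0):
    """Run the sketch from theta = (0, 0); returns final theta and archive."""
    rng = random.Random(seed)
    theta, archive = [0.0, 0.0], []
    for _ in range(steps):
        theta = ars_step(theta, archive, rng)
    return theta, archive
```

Calling `run(200)` returns the final parameters and the archive of visited behaviours. In a real pursuit-evasion setting, `rollout` would be replaced by a simulated engagement and `behavior` by some summary of the resulting trajectory; both choices are left open here because the abstract does not specify them.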