Autonomous Guidance Between Quasiperiodic Orbits in Cislunar Space via Deep Reinforcement Learning

IF 1.3 | CAS Tier 4 (Engineering & Technology) | JCR Q2 (Engineering, Aerospace)
Lorenzo Federici, A. Scorsoglio, Alessandro Zavoli, R. Furfaro
DOI: 10.2514/1.a35747
Journal of Spacecraft and Rockets, published 2023-08-25
Citations: 0

Abstract

This paper investigates the use of reinforcement learning for the fuel-optimal guidance of a spacecraft during a time-free low-thrust transfer between two libration point orbits in the cislunar environment. To this aim, a deep neural network is trained via proximal policy optimization to map any spacecraft state to the optimal control action. A general-purpose reward is used to guide the network toward a fuel-optimal control law, regardless of the specific pair of libration orbits considered and without the use of any ad hoc reward shaping technique. Eventually, the learned control policies are compared with the optimal solutions provided by a direct method in two different mission scenarios, and Monte Carlo simulations are used to assess the policies’ robustness to navigation uncertainties.
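The proximal policy optimization (PPO) training mentioned in the abstract centers on a clipped surrogate objective that keeps each policy update close to the previous policy. As a minimal illustrative sketch (not the authors' implementation; the function name and sample values are hypothetical), the per-sample clipped loss can be written as:

```python
import math

def ppo_clip_loss(logp_new, logp_old, advantage, eps=0.2):
    """PPO clipped surrogate loss for a single (state, action) sample.

    ratio = pi_new(a|s) / pi_old(a|s); clipping the ratio to
    [1 - eps, 1 + eps] limits how far a single update can move the policy.
    """
    ratio = math.exp(logp_new - logp_old)
    clipped_ratio = max(min(ratio, 1.0 + eps), 1.0 - eps)
    # PPO maximizes the minimum of the unclipped and clipped objectives;
    # the negation turns it into a loss to be minimized.
    return -min(ratio * advantage, clipped_ratio * advantage)

# With a positive advantage and a probability ratio of 1.5 (> 1 + eps),
# the clipped term 1.2 * 2.0 = 2.4 dominates, so the loss is -2.4.
loss = ppo_clip_loss(logp_new=math.log(1.5), logp_old=0.0, advantage=2.0)
```

In a guidance setting like the one described, the "sample" would pair an observed spacecraft state with the thrust action taken, and the advantage would reflect the general-purpose, fuel-related reward.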
Source Journal
Journal of Spacecraft and Rockets (Engineering: Aerospace)
CiteScore: 3.60
Self-citation rate: 18.80%
Annual articles: 185
Review time: 4.5 months
Journal description: This journal, which started it all back in 1963, is devoted to the advancement of the science and technology of astronautics and aeronautics through the dissemination of original archival research papers disclosing new theoretical developments and/or experimental results. Topics include aeroacoustics, aerodynamics, combustion, fundamentals of propulsion, fluid mechanics and reacting flows, fundamental aspects of the aerospace environment, hydrodynamics, lasers and associated phenomena, plasmas, research instrumentation and facilities, structural mechanics and materials, optimization, and thermomechanics and thermochemistry. Papers are also sought that review, in an intensive manner, the results of recent research developments on any of the topics listed above.