Propulsive landing of launchers' first stages with Deep Reinforcement Learning

Impact Factor 3.1 · CAS Tier 2 (Physics and Astrophysics) · JCR Q1 (Engineering, Aerospace)
Davide Iafrate , Andrea Brandonisio , Robert Hinz , Michèle Lavagna
DOI: 10.1016/j.actaastro.2024.11.028
Journal: Acta Astronautica, Volume 227, Pages 40–56
Published: 2024-11-22 (Journal Article)
Citations: 0

Abstract

The planetary landing problem is gaining relevance in the space sector, spanning applications from unmanned probes landing on other planetary bodies to reusable first and second stages of launch vehicles. Existing methods lack flexibility in handling complex non-linear dynamics, particularly in the presence of non-convexifiable constraints, so it is crucial to assess the performance of novel techniques and weigh their advantages and disadvantages. The purpose of this work is to develop an integrated 6-DOF guidance and control approach, based on reinforcement learning of deep neural-network policies, for fuel-optimal planetary landing control, specifically the terminal landing of a launcher first stage, and to assess its performance and robustness. 3-DOF and 6-DOF simulators are developed and encapsulated in MDP-like (Markov Decision Process), industry-standard-compatible environments. Particular care is taken in shaping reward functions that achieve the landing both successfully and in a fuel-optimal manner. A cloud pipeline is developed for effectively training an agent with the PPO reinforcement learning algorithm to achieve the landing goal.
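To make the abstract's setup concrete, the sketch below shows (not the authors' code) what an MDP-style propulsive-landing environment with a shaped, fuel-penalizing reward can look like, reduced to 1-DOF vertical dynamics for brevity. The `reset()`/`step()` interface mirrors the industry-standard Gym/Gymnasium convention the paper's 3-DOF and 6-DOF simulators are said to follow; all vehicle constants and reward weights here are illustrative assumptions.

```python
class LandingEnv1D:
    """Minimal MDP-like environment sketch: vertical propulsive landing.

    State: (altitude [m], vertical velocity [m/s], mass [kg]).
    Action: throttle in [0, 1]. All constants are illustrative, not from the paper.
    """

    G = 9.81            # gravitational acceleration, m/s^2
    DT = 0.1            # integration time step, s
    ISP = 300.0         # specific impulse, s (illustrative)
    MAX_THRUST = 2.0e6  # maximum thrust, N (illustrative)

    def __init__(self):
        self.reset()

    def reset(self):
        # Terminal-descent initial condition: 1 km up, falling at 80 m/s.
        self.h, self.v, self.m = 1000.0, -80.0, 40_000.0
        return (self.h, self.v, self.m)

    def step(self, throttle):
        throttle = min(max(throttle, 0.0), 1.0)
        thrust = throttle * self.MAX_THRUST
        mdot = thrust / (self.ISP * 9.81)      # propellant mass flow, kg/s
        accel = thrust / self.m - self.G       # net vertical acceleration
        self.v += accel * self.DT              # explicit Euler integration
        self.h += self.v * self.DT
        self.m -= mdot * self.DT
        done = self.h <= 0.0
        # Shaped reward: per-step fuel penalty, plus a terminal bonus for a
        # soft touchdown (|v| < 5 m/s) or a penalty for crashing.
        reward = -mdot * self.DT * 1e-2
        if done:
            reward += 100.0 if abs(self.v) < 5.0 else -100.0
        return (self.h, self.v, self.m), reward, done
```

A PPO implementation (e.g. from a standard RL library) would then be trained against many vectorized copies of such an environment; the paper's contribution lies in extending this pattern to full 3-DOF/6-DOF dynamics and in tuning the reward shaping so the learned policy is both reliable and fuel-optimal.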
Source journal: Acta Astronautica (Engineering Technology – Aerospace Engineering)
CiteScore: 7.20
Self-citation rate: 22.90%
Articles published per year: 599
Review time: 53 days
Journal description: Acta Astronautica is sponsored by the International Academy of Astronautics. Content is based on original contributions in all fields of basic, engineering, life and social space sciences and of space technology related to: the peaceful scientific exploration of space; its exploitation for human welfare and progress; and the conception, design, development and operation of space-borne and Earth-based systems. In addition to regular issues, the journal publishes selected proceedings of the annual International Astronautical Congress (IAC), transactions of the IAA, and special issues on topics of current interest, such as microgravity, space station technology, geostationary orbits, and space economics. Other subject areas include satellite technology, space transportation and communications, space energy, power and propulsion, astrodynamics, extraterrestrial intelligence, and Earth observations.