Fuel-aware autonomous docking using RL-augmented MPC rewards for on-orbit refueling

IF 3.4, Zone 2 (Physics and Astrophysics), JCR Q1 (Engineering, Aerospace)

Acta Astronautica Pub Date : 2025-09-19 DOI:10.1016/j.actaastro.2025.09.046

Mahya Ramezani, M. Amin Alandihallaj, Barış Can Yalçın, Miguel Angel Olivares Mendez, Andreas M. Hein

Acta Astronautica, Volume 238, Pages 690-705. Journal Article. https://www.sciencedirect.com/science/article/pii/S009457652500623X

Citations: 0

Abstract

The operational lifespan of satellites is constrained by finite fuel reserves, limiting their maneuverability and mission duration. On-orbit refueling offers a transformative solution, extending satellite functionality, reducing costs, and enhancing sustainability. However, the precise execution of docking maneuvers remains a critical challenge, exacerbated by fuel sloshing effects in microgravity, which introduce unpredictable disturbances. This study proposes an integrated control framework combining Model Predictive Control (MPC) and Reinforcement Learning (RL) to ensure safe and efficient docking under these dynamic conditions. Initially, a Proximal Policy Optimization (PPO)-based RL control strategy is introduced, leveraging MPC for trajectory optimization. To further enhance adaptability in highly dynamic environments, Soft Actor-Critic (SAC) is incorporated, offering superior sample efficiency and robustness against stochastic disturbances. The proposed SAC-MPC framework effectively mitigates fuel sloshing effects by balancing computational efficiency with predictive accuracy. Experimental validation is conducted in the Zero-G Lab, emulating control scenarios with 3-DoF floating platforms, while high-fidelity numerical simulations extend the study to 6-DoF dynamics with realistic sloshing behavior modeled using OpenFOAM. Comparative results demonstrate that SAC-MPC outperforms conventional RL and MPC-based methods in docking success rate, precision, and control effort. This research establishes a robust foundation for autonomous satellite docking, contributing to the viability of on-orbit refueling missions and the future of sustainable space operations.

Source journal

Acta Astronautica (Engineering and Technology: Aerospace Engineering)

CiteScore: 7.20

Self-citation rate: 22.90%

Articles published: 599

Review time: 53 days

About the journal: Acta Astronautica is sponsored by the International Academy of Astronautics. Content is based on original contributions in all fields of basic, engineering, life and social space sciences and of space technology related to: the peaceful scientific exploration of space; its exploitation for human welfare and progress; and the conception, design, development and operation of space-borne and Earth-based systems. In addition to regular issues, the journal publishes selected proceedings of the annual International Astronautical Congress (IAC), transactions of the IAA and special issues on topics of current interest, such as microgravity, space station technology, geostationary orbits, and space economics. Other subject areas include satellite technology, space transportation and communications, space energy, power and propulsion, astrodynamics, extraterrestrial intelligence and Earth observations.