Robust adaptive guidance for autonomous asteroid landing via search-based meta-reinforcement learning
Authors: Zheng Chen, Shuxin Shen, Hutao Cui, Yang Tian
Journal: Acta Astronautica, Volume 236, Pages 723-734
Publication date: 2025-07-17
Publication type: Journal Article
DOI: 10.1016/j.actaastro.2025.07.001
URL: https://www.sciencedirect.com/science/article/pii/S0094576525004229
Impact factor: 3.4 (JCR Q1, Engineering, Aerospace)
Citations: 0
Abstract
Future asteroid missions require safe landings despite limited prior knowledge and significant uncertainties, posing a critical challenge to current autonomous guidance strategies. This paper introduces a novel robust adaptive guidance framework that integrates meta-reinforcement learning with Monte Carlo Tree Search (MCTS) to enable both rapid learning and efficient adaptation to diverse asteroids. The framework leverages a recurrent network within its meta-reinforcement learning architecture to perceive and respond to dynamic system parameters, ensuring adaptability across varied mission scenarios. The network is trained via an MCTS-based optimization algorithm, where the tree search enhances policy exploration and effectively handles the high-latency rewards of the landing task. Moreover, we introduce an enhanced MCTS by incorporating double progressive widening modifications to refine the deployed action policies. Numerical simulations demonstrate the proposed framework’s superior performance and robustness in achieving reliable landing guidance across a wide range of environmental uncertainties.
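As a rough illustration of the double progressive widening idea mentioned in the abstract, the sketch below uses the widening test commonly found in the MCTS literature: a node may receive a new child only while its child count is at most k * N^alpha, where N is the node's visit count. The abstract does not state the authors' exact criterion or constants, so the function names and the `k` and `alpha` values here are assumptions for illustration only. In the double variant, the same test is applied at two levels: to candidate actions at state nodes and to sampled next states at action nodes.

```python
def may_widen(num_children: int, visits: int, k: float, alpha: float) -> bool:
    """Progressive widening test: allow a new child while
    num_children <= k * visits**alpha (assumed form, not taken from the paper)."""
    return num_children <= k * visits ** alpha


def children_after(visits: int, k: float, alpha: float) -> int:
    """Count the children a node would hold after `visits` visits if it
    adds one child whenever the widening test passes."""
    children = 0
    for n in range(visits):
        if may_widen(children, n, k, alpha):
            children += 1
    return children


# Hypothetical constants: one pair for action widening (state nodes),
# one pair for next-state widening (action nodes).
ACTION_K, ACTION_ALPHA = 2.0, 0.5
STATE_K, STATE_ALPHA = 1.0, 0.25

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n,
              children_after(n, ACTION_K, ACTION_ALPHA),
              children_after(n, STATE_K, STATE_ALPHA))
```

Because alpha is below 1, the number of children grows sublinearly with the visit count, so the search keeps revisiting existing branches in continuous action and state spaces instead of opening a new branch on every simulation.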
About the journal:
Acta Astronautica is sponsored by the International Academy of Astronautics. Content is based on original contributions in all fields of basic, engineering, life and social space sciences and of space technology related to:
The peaceful scientific exploration of space,
Its exploitation for human welfare and progress,
Conception, design, development and operation of space-borne and Earth-based systems.
In addition to regular issues, the journal publishes selected proceedings of the annual International Astronautical Congress (IAC), transactions of the IAA and special issues on topics of current interest, such as microgravity, space station technology, geostationary orbits, and space economics. Other subject areas include satellite technology, space transportation and communications, space energy, power and propulsion, astrodynamics, extraterrestrial intelligence and Earth observations.