Diva Kartika Larasati, Larasmoyo Nugroho, S. Wijaya, R. Andiarti, Rini Akmeliawati, P. Prajitno, Ery Fitrianingsih
2022 IEEE International Conference on Aerospace Electronics and Remote Sensing Technology (ICARES), published 2022-11-24. DOI: 10.1109/ICARES56907.2022.9992304 (https://doi.org/10.1109/ICARES56907.2022.9992304)
Genetic Algorithms Optimization of a Reinforcement Learning-based Controller for Vertical Landing Rocket Case
A reward function in reinforcement learning is the formalization of the objective. Finding the ideal reward function is a challenge that calls for a deliberate search strategy. A genetic algorithm is a suitable approach for reward function search because of its thoroughness. The Deep Deterministic Policy Gradient (DDPG) algorithm, the focus of this research, serves as a reinforcement learning-based controller whose performance improves after a genetic algorithm optimizes the agent's reward function. The optimized controller achieves a smaller miss distance and a lower landing velocity than the reference DDPG controller, and significantly lower fuel consumption than a PID controller.
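The idea of searching reward functions with a genetic algorithm can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the reward terms, the encoding, or the GA operators, so everything below is an assumption. The reward is assumed to be a weighted penalty on miss distance, landing velocity, and fuel use; the GA uses tournament selection, uniform crossover, Gaussian mutation, and elitism; and `placeholder_fitness` is a hypothetical toy surrogate standing in for the expensive step of training a DDPG agent with the candidate weights and scoring its landings.

```python
import random

N_WEIGHTS = 3          # assumed: one weight per reward term
BOUNDS = (0.0, 10.0)   # assumed search range for every weight


def reward(weights, miss_distance, landing_velocity, fuel_used):
    """Weighted-penalty reward. The three terms are assumed, not taken from the paper."""
    w_d, w_v, w_f = weights
    return -(w_d * miss_distance + w_v * abs(landing_velocity) + w_f * fuel_used)


def placeholder_fitness(weights):
    """Hypothetical stand-in for 'train DDPG with these weights, then score the landings'.

    A toy quadratic with an arbitrary optimum, so the sketch runs in milliseconds.
    """
    target = (4.0, 6.0, 1.5)  # arbitrary values, for illustration only
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


def tournament(population, scores, rng, k=3):
    """Pick k random individuals and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from either parent with equal probability."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(individual, rng, rate=0.2, sigma=0.5):
    """Add Gaussian noise to each gene with probability `rate`, clipped to BOUNDS."""
    lo, hi = BOUNDS
    return [
        min(hi, max(lo, g + rng.gauss(0.0, sigma))) if rng.random() < rate else g
        for g in individual
    ]


def genetic_search(fitness, pop_size=30, generations=40, seed=0):
    """Return the best weight vector found and the best score per generation."""
    rng = random.Random(seed)
    lo, hi = BOUNDS
    population = [[rng.uniform(lo, hi) for _ in range(N_WEIGHTS)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        elite = max(range(pop_size), key=lambda i: scores[i])
        history.append(scores[elite])
        children = [list(population[elite])]  # elitism: keep the best unchanged
        while len(children) < pop_size:
            p1 = tournament(population, scores, rng)
            p2 = tournament(population, scores, rng)
            children.append(mutate(crossover(p1, p2, rng), rng))
        population = children
    scores = [fitness(ind) for ind in population]
    best = max(range(pop_size), key=lambda i: scores[i])
    history.append(scores[best])
    return population[best], history


if __name__ == "__main__":
    best_weights, history = genetic_search(placeholder_fitness)
    print("best weights:", best_weights)
    print("best score per generation (first, last):", history[0], history[-1])
```

In a real study the fitness call dominates the cost, since each candidate requires a full training run, so population size and generation count would need to be chosen with that budget in mind. Because the elite individual is copied unchanged into each new generation, the best score in this sketch never decreases from one generation to the next.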