Attitude Control for Quadcopters using Reinforcement Learning
Shun Nakasone, R. Galluzzi, Rogelio Bustamante-Bello
2022 International Symposium on Electromobility (ISEM), published 2022-10-17
DOI: 10.1109/ISEM55847.2022.9976737 (https://doi.org/10.1109/ISEM55847.2022.9976737)
Citations: 0
Abstract
This paper presents a novel control strategy based on reinforcement learning, aimed at improving attitude control performance for quadcopters. An agent is trained with Proximal Policy Optimization (PPO), learning from a reward function through interaction with the environment. The control algorithm obtained from this training process is tested in simulation against proportional-integral-derivative (PID) control, the most common attitude control algorithm in drone racing. The resulting control policies performed comparably to the PID baseline and, in some cases, outperformed it in noise rejection and robustness to external disturbances.
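To make the comparison concrete, the following is a minimal sketch of the kind of PID baseline the abstract refers to, together with a reward signal of the sort a PPO agent could be trained on. It is not the authors' implementation: the single-axis rigid-body model, the gains, the inertia, the time step, and the reward weights are all assumptions chosen for illustration, and the PPO agent itself is not shown.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook PID controller for one attitude axis (gains are illustrative)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        # Accumulate the integral term and take a finite-difference derivative.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def tracking_reward(error: float, torque: float) -> float:
    """Hypothetical reward: penalize attitude error and control effort.

    The 0.01 effort weight is an assumption, not a value from the paper.
    """
    return -(error ** 2) - 0.01 * (torque ** 2)


def simulate_roll(controller: PID, target: float = 0.5, inertia: float = 0.01,
                  dt: float = 0.002, steps: int = 2000):
    """Simulate one axis as a pure rotational inertia (assumed model).

    Returns the final angle in radians and the accumulated reward.
    """
    angle, rate, total_reward = 0.0, 0.0, 0.0
    for _ in range(steps):
        error = target - angle
        torque = controller.step(error, dt)
        # Semi-implicit Euler integration of torque = inertia * angular acceleration.
        rate += (torque / inertia) * dt
        angle += rate * dt
        total_reward += tracking_reward(error, torque)
    return angle, total_reward


if __name__ == "__main__":
    final_angle, score = simulate_roll(PID(kp=0.5, ki=0.1, kd=0.05))
    print(f"final roll angle: {final_angle:.3f} rad, accumulated reward: {score:.2f}")
```

Under these assumptions, a learned policy would replace `PID.step` as the mapping from attitude error to torque, and the accumulated reward gives one common yardstick for scoring both controllers on the same simulated episode.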