The Research of Quadrotor Flight Control Based on Reinforcement Learning and ADP

Xueyuan Li, Wentao Xie, Wentao Zhan
Published in: 2022 8th Annual International Conference on Network and Information Systems for Computers (ICNISC)
Publication date: 2022-09-01
DOI: 10.1109/ICNISC57059.2022.00061 (https://doi.org/10.1109/ICNISC57059.2022.00061)
Citations: 1

Abstract

This paper studies the application of lookup-table reinforcement learning to continuous state-space control of a quadrotor simulator and designs an attitude controller for the simulator based on Q-learning. To address the difficult convergence and low learning efficiency of Q-learning on large-scale, continuous-space decision optimization problems, kernel-based approximate dynamic programming is introduced, Kernel-based Least-Squares Policy Iteration (KLSPI) is proposed, and a controller for the quadrotor simulator is designed based on this algorithm. Experiments show that the reinforcement learning control method offers fast convergence, small steady-state error, strong adaptability, and good control performance. On continuous state-space problems, Least-Squares Policy Iteration converges to better policies with less training data than the traditional approach of discretizing the state space first.
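As a rough illustration of the lookup-table approach the abstract describes, the sketch below discretizes a continuous attitude state into a grid and applies the standard Q-learning update to a table indexed by grid cell. It is not the authors' controller: the single-axis plant in `step`, the reward, the grid resolution, the torque set, and every function name here are assumptions made for illustration, since the abstract gives none of these details.

```python
import random

# Hypothetical toy plant: one attitude axis (angle, angular rate) driven by a
# torque command. It stands in for the quadrotor simulator, whose dynamics
# are not given in the abstract.
DT = 0.05
ACTIONS = (-1.0, 0.0, 1.0)        # assumed discrete torque commands
N_ANGLE, N_RATE = 11, 11          # assumed grid resolution
ANGLE_MAX, RATE_MAX = 0.5, 2.0    # assumed state bounds (rad, rad/s)


def step(angle, rate, torque):
    """Euler-integrate the toy plant; the reward is a negative quadratic cost."""
    rate = max(-RATE_MAX, min(RATE_MAX, rate + torque * DT))
    angle = max(-ANGLE_MAX, min(ANGLE_MAX, angle + rate * DT))
    reward = -(angle ** 2 + 0.1 * rate ** 2)
    return angle, rate, reward


def bin_index(value, limit, n_bins):
    """Map a continuous value in [-limit, limit] to an integer bin."""
    frac = (value + limit) / (2.0 * limit)
    return min(n_bins - 1, max(0, int(frac * n_bins)))


def discretize(angle, rate):
    """Turn the continuous state into a lookup-table key."""
    return bin_index(angle, ANGLE_MAX, N_ANGLE), bin_index(rate, RATE_MAX, N_RATE)


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[s_next])
    q[s][a] += alpha * (r + gamma * best_next - q[s][a])


def train(episodes=200, steps=100, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning on the discretized toy plant."""
    rng = random.Random(seed)
    q = {(i, j): [0.0] * len(ACTIONS)
         for i in range(N_ANGLE) for j in range(N_RATE)}
    for _ in range(episodes):
        angle, rate = rng.uniform(-ANGLE_MAX, ANGLE_MAX), 0.0
        for _ in range(steps):
            s = discretize(angle, rate)
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))       # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda k: q[s][k])  # exploit
            angle, rate, r = step(angle, rate, ACTIONS[a])
            q_update(q, s, a, r, discretize(angle, rate))
    return q
```

The table holds one entry per grid cell and action, so its size grows multiplicatively with each added state dimension and with finer bins. That growth is the scaling difficulty that motivates the move to kernel-based approximation in the abstract.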