Quadrotor Motion Control Using Deep Reinforcement Learning

IF 1.3 Q3 REMOTE SENSING

Journal of Unmanned Vehicle Systems, Pub Date: 2021-09-27, DOI: 10.1139/juvs-2021-0010

Zifei Jiang, Alan Francis Lynch

Citations: 4

Abstract

We present a deep neural net-based controller trained by a model-free reinforcement learning (RL) algorithm to achieve hover stabilization for a quadrotor unmanned aerial vehicle (UAV). With RL, two neural nets are trained. One neural net is used as a stochastic controller which gives the distribution of control inputs. The other maps the UAV state to a scalar which estimates the reward of the controller. A proximal policy optimization (PPO) method, which is an actor-critic policy gradient approach, is used to train the neural nets. Simulation results show that the trained controller achieves a comparable level of performance to a manually-tuned PID controller, despite not depending on any model information. The paper considers different choices of reward function and their influence on controller performance.
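To illustrate the training objective named in the abstract, here is a minimal Python sketch of the PPO clipped surrogate objective together with the log-density of a Gaussian stochastic policy. It is not the authors' implementation: the function names, the scalar action, and the clip parameter of 0.2 are assumptions chosen for illustration only.

```python
import math


def gaussian_log_prob(action, mean, std):
    """Log-density of a scalar action under a Gaussian policy N(mean, std^2).

    In an actor-critic setup, the actor network would output `mean` (and
    possibly `std`) for the current state; here they are plain numbers.
    """
    z = (action - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to be maximized).

    new_log_probs: log pi_new(a|s) for each sampled (state, action) pair
    old_log_probs: log pi_old(a|s) under the policy that collected the data
    advantages:    advantage estimates, e.g. derived from the critic's output
    clip_eps:      clipping range (0.2 is an assumed, commonly used value)
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the updated policy far from the one that generated the samples, which is the property that distinguishes PPO from a plain policy gradient step.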

Source journal

Journal of Unmanned Vehicle Systems REMOTE SENSING

CiteScore

5.30

Self-citation rate

0.00%

Articles published