{"title":"Heuristic dynamic programming for mobile robot path planning based on Dyna approach","authors":"Seaar Al Dabooni, D. Wunsch","doi":"10.1109/IJCNN.2016.7727679","DOIUrl":null,"url":null,"abstract":"This paper presents a direct heuristic dynamic programming (HDP) based on Dyna planning (Dyna_HDP) for online model learning in a Markov decision process. This novel technique is composed of HDP policy learning to construct the Dyna agent for speeding up the learning time. We evaluate Dyna_HDP on a differential-drive wheeled mobile robot navigation problem in a 2D maze. The simulation is introduced to compare Dyna_HDP with other traditional reinforcement learning algorithms, namely one step Q-learning, Sarsa (λ), and Dyna_Q, under the same benchmark conditions. We demonstrate that Dyna_HDP has a faster near-optimal path than other algorithms, with high stability. In addition, we also confirm that the Dyna_HDP method can be applied in a multi-robot path planning problem. The virtual common environment model is learned from sharing the robots' experiences which significantly reduces the learning time.","PeriodicalId":109405,"journal":{"name":"2016 International Joint Conference on Neural Networks (IJCNN)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 International Joint Conference on Neural Networks (IJCNN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IJCNN.2016.7727679","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 15
Abstract
This paper presents direct heuristic dynamic programming (HDP) based on Dyna planning (Dyna_HDP) for online model learning in a Markov decision process. The technique combines HDP policy learning with a Dyna agent to shorten learning time. We evaluate Dyna_HDP on a differential-drive wheeled mobile robot navigating a 2D maze. Simulations compare Dyna_HDP with traditional reinforcement learning algorithms, namely one-step Q-learning, Sarsa(λ), and Dyna_Q, under the same benchmark conditions. We demonstrate that Dyna_HDP converges to a near-optimal path faster than the other algorithms, and with high stability. In addition, we confirm that Dyna_HDP can be applied to a multi-robot path planning problem: a virtual common environment model is learned from the robots' shared experiences, which significantly reduces learning time.
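To make the Dyna planning loop the paper builds on concrete, below is a minimal tabular Dyna-Q sketch (one of the baselines above): the agent learns directly from real transitions, stores them in a learned model, and then replays simulated transitions from that model to speed up learning. The class name, hyperparameter values, and the assumption of a deterministic discrete maze are illustrative choices, not details taken from the paper.

```python
import random
from collections import defaultdict

class DynaQAgent:
    """Tabular Dyna-Q sketch: direct RL + model learning + planning replay."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1, planning_steps=10):
        self.q = defaultdict(float)   # Q(s, a) value table
        self.model = {}               # learned model: (s, a) -> (reward, next_state)
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.planning_steps = planning_steps

    def act(self, state):
        # Epsilon-greedy action selection.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def _update(self, s, a, r, s2):
        # One-step Q-learning backup.
        best_next = max(self.q[(s2, a2)] for a2 in self.actions)
        self.q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.q[(s, a)])

    def learn(self, s, a, r, s2):
        self._update(s, a, r, s2)             # direct RL from the real transition
        self.model[(s, a)] = (r, s2)          # model learning (deterministic maze assumed)
        for _ in range(self.planning_steps):  # planning: replay simulated experience
            (ps, pa), (pr, ps2) = random.choice(list(self.model.items()))
            self._update(ps, pa, pr, ps2)
```

The paper's Dyna_HDP replaces the tabular Q-learning component here with HDP policy learning (actor-critic style value approximation), while keeping the same model-based planning replay to accelerate convergence.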