Method for applying reinforcement learning to motion planning and control of under-actuated underwater vehicle in unknown non-uniform sea flow

H. Kawano
DOI: 10.1109/IROS.2005.1544973 (https://doi.org/10.1109/IROS.2005.1544973)
Published in: 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems
Publication date: 2005-12-05
Citations: 14

Abstract

The development of a practical motion planning and control algorithm for under-actuated robots subject to unknown disturbances is an important issue in robotics research. In the case of under-actuated underwater vehicles, developing such an algorithm has been particularly difficult for several reasons. First, the motion planning calculation must consider not only the kinematic characteristics of the motion but also the dynamic characteristics of the underwater vehicle. Second, it is very difficult to ascertain the exact velocity distribution of the non-uniform sea flow around obstacles on the seabed before the mission. Third, the sea flow strongly affects the motion of an underwater vehicle because the flow speed is high compared with the vehicle's own speed. This paper proposes a new method based on the application of reinforcement learning to solve these problems. Reinforcement learning based on the Markov decision process (MDP) is known to be suitable for acquiring motion control algorithms for robots acting in a stochastic environment with disturbances. However, to apply a reinforcement learning method, the robot's motion must be suitably digitized and the learning environment must match the robot's mission environment. This paper introduces a motion-digitizing method based on an artificial neuron model and a method for compensating for the difference between the learning and mission environments. The performance of the proposed algorithm is examined through dynamical simulation of an under-actuated underwater vehicle cruising in an environment with an obstacle and an unknown non-uniform flow simulated by potential flow.
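The MDP-based reinforcement-learning setup the abstract describes can be illustrated with a minimal sketch. The gridworld below, its reward values, the northward "current" disturbance, and all function names are hypothetical stand-ins chosen for illustration only; they do not reproduce the paper's vehicle dynamics, neuron-model motion digitizing, or potential-flow simulation. The sketch shows the core idea of tabular Q-learning acquiring a goal-reaching policy under a stochastic flow disturbance, with one obstacle as in the paper's test scenario.

```python
import random

# Illustrative sketch only: tabular Q-learning on a toy grid with a
# stochastic "cross-current". All constants below are hypothetical.

W, H = 6, 5
GOAL = (5, 2)
OBSTACLE = (3, 2)                                # single obstacle
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]     # E, W, N, S
CURRENT_PUSH = 0.3   # probability the flow shifts the vehicle one cell north

def step(state, action, rng):
    """Apply an action; the unknown flow may add an extra northward push."""
    x, y = state
    dx, dy = action
    nx, ny = x + dx, y + dy
    if rng.random() < CURRENT_PUSH:
        ny += 1                                  # disturbance acts on top of control
    nx = min(max(nx, 0), W - 1)                  # stay inside the grid
    ny = min(max(ny, 0), H - 1)
    if (nx, ny) == OBSTACLE:
        return state, -1.0, False                # blocked: stay put, small penalty
    if (nx, ny) == GOAL:
        return (nx, ny), 10.0, True
    return (nx, ny), -0.1, False                 # per-step cost encourages short paths

def train(episodes=3000, alpha=0.3, gamma=0.95, eps=0.2, seed=0):
    """Learn a Q-table with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(x, y): [0.0] * len(ACTIONS) for x in range(W) for y in range(H)}
    for _ in range(episodes):
        s = (0, 2)
        for _ in range(100):
            a = (rng.randrange(len(ACTIONS)) if rng.random() < eps
                 else max(range(len(ACTIONS)), key=lambda i: q[s][i]))
            s2, r, done = step(s, ACTIONS[a], rng)
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
            if done:
                break
    return q

def rollout(q, seed=1):
    """Follow the greedy policy once; return (total reward, reached goal)."""
    rng = random.Random(seed)
    s, total = (0, 2), 0.0
    for _ in range(100):
        a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
        s, r, done = step(s, ACTIONS[a], rng)
        total += r
        if done:
            return total, True
    return total, False

if __name__ == "__main__":
    q = train()
    ret, reached = rollout(q)
    print(f"reached goal: {reached}, return: {ret:.1f}")
```

Because the disturbance is part of the transition model during training, the learned policy implicitly accounts for it, which is the essence of treating flow as stochasticity in an MDP; the paper's contribution lies in making this work for continuous vehicle dynamics and in bridging the gap between the learning and mission environments.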