Adaptive Dynamic Programming Based Motion Control of Autonomous Underwater Vehicles

2018 5th International Conference on Control, Decision and Information Technologies (CoDIT) | Pub Date: 2018-04-10 | DOI: 10.1109/CoDIT.2018.8394934

Siddhant Vibhute


Citations: 8

Abstract


Adaptive Dynamic Programming Based Motion Control of Autonomous Underwater Vehicles

This paper uses Adaptive Dynamic Programming (ADP) to achieve optimal motion control of an Autonomous Underwater Vehicle (AUV). It proposes a model-free method that accounts for actuator input and obstacle positions while tracing an optimal path. Machine learning is used to develop a path planner that avoids collisions with static obstacles. ADP approximates the solution of the cost functional, so the positions of nearby obstacles need not be known a priori; they are needed only once they fall within a designed safety envelope. The path-planning objective is achieved with dynamic programming. A least-squares policy method serves as a recursive algorithm to approximate the value function over the domain, giving an approach for finite-space, discrete control systems. The obstacle-free path finder is designed to generate an optimal action that minimizes a local cost, defined by a functional, under constrained optimization. The optimal value function is described by the Hamilton-Jacobi-Bellman (HJB) equation, which is impractical to solve analytically. To avoid the complex calculations that the HJB equation requires, ADP, a method based on Reinforcement Learning (RL), is implemented. The paper outlines how machine learning can be used to realize a real-time obstacle avoidance system.
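The abstract does not give the algorithm's details, so the following is only a minimal sketch of the underlying idea: on a finite, discrete state space, the value function can be approximated by repeatedly applying the Bellman backup, which is the discrete counterpart of the HJB equation, and an obstacle-free path follows from acting greedily on it. It uses plain tabular value iteration as a stand-in, not the paper's least-squares policy method. The grid size, obstacle layout, unit step cost, and all function names are assumptions made for illustration. Unlike the paper's setting, the obstacles here are known in advance.

```python
from itertools import product

# All constants below are illustrative assumptions, not values from the paper.
SIZE = 5                                   # 5x5 grid of vehicle positions
GOAL = (4, 4)                              # target cell
OBSTACLES = {(1, 1), (2, 1), (1, 3), (3, 3)}  # static obstacle cells
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # unit moves along each axis
STEP_COST = 1.0                            # local cost of any single move


def step(cell, action):
    """Deterministic transition; a blocked or off-grid move leaves the cell unchanged."""
    nxt = (cell[0] + action[0], cell[1] + action[1])
    inside = 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE
    return nxt if inside and nxt not in OBSTACLES else cell


def value_iteration(tol=1e-9, max_sweeps=1000):
    """Approximate the optimal cost-to-go V by repeated Bellman backups."""
    V = {c: 0.0 for c in product(range(SIZE), repeat=2) if c not in OBSTACLES}
    for _ in range(max_sweeps):
        delta = 0.0
        for cell in V:
            if cell == GOAL:
                continue  # cost-to-go at the goal stays zero
            best = min(STEP_COST + V[step(cell, a)] for a in ACTIONS)
            delta = max(delta, abs(best - V[cell]))
            V[cell] = best
        if delta < tol:
            break
    return V


def greedy_path(V, start, max_steps=100):
    """Follow the action with the lowest cost-to-go until the goal is reached."""
    path = [start]
    while path[-1] != GOAL and len(path) <= max_steps:
        cell = path[-1]
        action = min(ACTIONS, key=lambda a: V[step(cell, a)])
        path.append(step(cell, action))
    return path


if __name__ == "__main__":
    V = value_iteration()
    print("cost-to-go from (0, 0):", V[(0, 0)])
    print("path:", greedy_path(V, (0, 0)))
```

A table of values is feasible only because this grid is tiny. Function approximation, such as the least-squares method the abstract mentions, replaces the table when the state space is too large to enumerate.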

