{"title":"用于复杂水下环境中自动潜航器三维运动规划的自适应节能强化学习","authors":"Jiayi Wen , Anqing Wang , Jingwei Zhu , Fengbei Xia , Zhouhua Peng , Weidong Zhang","doi":"10.1016/j.oceaneng.2024.119111","DOIUrl":null,"url":null,"abstract":"<div><p>This paper addresses the problem of 3D motion planning for autonomous underwater vehicles (AUVs) in complex underwater environments where prior environmental information is unavailable. A policy-feature-based state-dependent-exploration soft actor-critic (PSDE-SAC) framework integrating prioritized experience relay (PER) mechanism is developed for energy-efficient AUV underwater navigation. Specifically, a generalized exponential-based energy consumption model is firstly constructed to enable accurate calculation of energy consumption between any two points in a 3D underwater environment regardless of environmental disturbances. Then, an adaptive reward function with adjustable weights is designed to balance energy consumption and travel distance. Based on the well-designed reward function, the PSDE-SAC motion planning framework is constructed such that the frequently encountered challenges of erratic motion and restricted exploration in reinforcement learning are addressed. In addition, with the introduction of PER and policy features, the convergence and exploration abilities of the PSDE-SAC framework are significantly enhanced. Simulation results illustrate the superiority of the proposed method against other reinforcement learning algorithms in terms of energy consumption, convergence, and stability.</p></div>","PeriodicalId":19403,"journal":{"name":"Ocean Engineering","volume":"312 ","pages":"Article 119111"},"PeriodicalIF":5.5000,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Adaptive energy-efficient reinforcement learning for AUV 3D motion planning in complex underwater environments\",\"authors\":\"Jiayi Wen , Anqing Wang , Jingwei Zhu , Fengbei Xia , Zhouhua Peng , Weidong Zhang\",\"doi\":\"10.1016/j.oceaneng.2024.119111\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>This paper addresses the problem of 3D motion planning for autonomous underwater vehicles (AUVs) in complex underwater environments where prior environmental information is unavailable. A policy-feature-based state-dependent-exploration soft actor-critic (PSDE-SAC) framework integrating prioritized experience relay (PER) mechanism is developed for energy-efficient AUV underwater navigation. Specifically, a generalized exponential-based energy consumption model is firstly constructed to enable accurate calculation of energy consumption between any two points in a 3D underwater environment regardless of environmental disturbances. Then, an adaptive reward function with adjustable weights is designed to balance energy consumption and travel distance. Based on the well-designed reward function, the PSDE-SAC motion planning framework is constructed such that the frequently encountered challenges of erratic motion and restricted exploration in reinforcement learning are addressed. In addition, with the introduction of PER and policy features, the convergence and exploration abilities of the PSDE-SAC framework are significantly enhanced. 
Simulation results illustrate the superiority of the proposed method against other reinforcement learning algorithms in terms of energy consumption, convergence, and stability.</p></div>\",\"PeriodicalId\":19403,\"journal\":{\"name\":\"Ocean Engineering\",\"volume\":\"312 \",\"pages\":\"Article 119111\"},\"PeriodicalIF\":5.5000,\"publicationDate\":\"2024-09-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Ocean Engineering\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0029801824024491\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, CIVIL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Ocean Engineering","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0029801824024491","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, CIVIL","Score":null,"Total":0}
Abstract
This paper addresses the problem of 3D motion planning for autonomous underwater vehicles (AUVs) in complex underwater environments where prior environmental information is unavailable. A policy-feature-based state-dependent-exploration soft actor-critic (PSDE-SAC) framework integrating a prioritized experience replay (PER) mechanism is developed for energy-efficient AUV underwater navigation. Specifically, a generalized exponential energy consumption model is first constructed to enable accurate calculation of the energy consumed between any two points in a 3D underwater environment, even in the presence of environmental disturbances. Then, an adaptive reward function with adjustable weights is designed to balance energy consumption against travel distance. Building on this reward function, the PSDE-SAC motion planning framework is constructed so that two challenges frequently encountered in reinforcement learning, erratic motion and restricted exploration, are addressed. In addition, the introduction of PER and policy features significantly enhances the convergence and exploration abilities of the PSDE-SAC framework. Simulation results illustrate the superiority of the proposed method over other reinforcement learning algorithms in terms of energy consumption, convergence, and stability.
Journal introduction:
Ocean Engineering provides a medium for the publication of original research and development work in the field of ocean engineering. Ocean Engineering seeks papers on the following topics.