Adjustable behavior-guided adaptive dynamic programming for neural learning control

IF 5.5 2区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neurocomputing Pub Date : 2025-03-18 DOI:10.1016/j.neucom.2025.129986

Guohan Tang, Ding Wang, Ao Liu, Junfei Qiao

{"title":"Adjustable behavior-guided adaptive dynamic programming for neural learning control","authors":"Guohan Tang, Ding Wang, Ao Liu, Junfei Qiao","doi":"10.1016/j.neucom.2025.129986","DOIUrl":null,"url":null,"abstract":"<div><div>In this article, an adjustable behavior-guided adaptive dynamic programming (BGADP) algorithm is designed to solve the optimal regulation problem for discrete-time systems. In conventional adaptive dynamic programming methods, gradient information of system dynamics is necessary for conducting policy improvement. However, these methods face challenges when gradient information cannot be computed or when the system dynamics is non-differentiable. To overcome these limitations, a human-behavior-inspired swarm intelligence approach is used to search for superior policies during the iterative process, eliminating the need for gradient information. Additionally, a relaxation factor is introduced into the value function update to accelerate the convergence speed of the algorithm. The monotonicity and convergence properties of the iterative value function are rigorously analyzed. Finally, the effectiveness and practicality of the adjustable BGADP algorithm are validated through two simulation studies, which are implemented using the actor–critic framework with neural networks.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"636 ","pages":"Article 129986"},"PeriodicalIF":5.5000,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231225006587","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

In this article, an adjustable behavior-guided adaptive dynamic programming (BGADP) algorithm is designed to solve the optimal regulation problem for discrete-time systems. In conventional adaptive dynamic programming methods, gradient information of system dynamics is necessary for conducting policy improvement. However, these methods face challenges when gradient information cannot be computed or when the system dynamics is non-differentiable. To overcome these limitations, a human-behavior-inspired swarm intelligence approach is used to search for superior policies during the iterative process, eliminating the need for gradient information. Additionally, a relaxation factor is introduced into the value function update to accelerate the convergence speed of the algorithm. The monotonicity and convergence properties of the iterative value function are rigorously analyzed. Finally, the effectiveness and practicality of the adjustable BGADP algorithm are validated through two simulation studies, which are implemented using the actor–critic framework with neural networks.

查看原文本刊更多论文

求助全文

约1分钟内获得全文求助全文

来源期刊

Neurocomputing 工程技术-计算机：人工智能

CiteScore

13.10

自引率

10.00%

发文量

1382

审稿时长

70 days

期刊介绍： Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.