Adjustable behavior-guided adaptive dynamic programming for neural learning control
Authors: Guohan Tang, Ding Wang, Ao Liu, Junfei Qiao
Journal: Neurocomputing, Volume 636, Article 129986 (Impact Factor 5.5; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2025-03-18
Publication type: Journal Article
DOI: 10.1016/j.neucom.2025.129986
URL: https://www.sciencedirect.com/science/article/pii/S0925231225006587
Citations: 0
Abstract
In this article, an adjustable behavior-guided adaptive dynamic programming (BGADP) algorithm is designed to solve the optimal regulation problem for discrete-time systems. Conventional adaptive dynamic programming methods require gradient information about the system dynamics to carry out policy improvement, so they run into difficulty when that gradient cannot be computed or when the system dynamics are non-differentiable. To overcome these limitations, a human-behavior-inspired swarm intelligence approach is used to search for superior policies during the iterative process, eliminating the need for gradient information. Additionally, a relaxation factor is introduced into the value function update to accelerate convergence. The monotonicity and convergence properties of the iterative value function are rigorously analyzed. Finally, the effectiveness and practicality of the adjustable BGADP algorithm are validated through two simulation studies, which are implemented using the actor–critic framework with neural networks.
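The two ideas in the abstract, gradient-free policy improvement and a relaxed value update, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the scalar plant with an input dead zone (`dynamics`), the quadratic `utility`, the tabulated value function standing in for the critic network, and the assumed form of the relaxed update, V_{i+1}(x) = ω · min_u [U(x, u) + V_i(F(x, u))] + (1 − ω) · V_i(x). The plain population-based random search in `search_control` is only a stand-in for the human-behavior-inspired swarm optimizer, which the abstract does not specify.

```python
import random

STEP = 0.1
GRID = [i * STEP - 1.0 for i in range(21)]  # tabulated states in [-1, 1]


def dynamics(x, u):
    """Hypothetical plant: the input passes through a dead zone, so F is not differentiable in u."""
    dead = 0.05
    eff = 0.0 if abs(u) < dead else u - dead * (1.0 if u > 0 else -1.0)
    return 0.9 * x + eff


def utility(x, u):
    """Assumed quadratic stage cost U(x, u)."""
    return x * x + u * u


def interp(V, x):
    """Linear interpolation of the tabulated value function, clamped to the grid range."""
    x = min(max(x, GRID[0]), GRID[-1])
    pos = (x - GRID[0]) / STEP
    i = min(int(pos), len(GRID) - 2)
    t = pos - i
    return (1.0 - t) * V[i] + t * V[i + 1]


def search_control(x, V, u_start, rng, pop=20, gens=5):
    """Gradient-free policy improvement: sample candidate controls around the
    best one found so far and keep whichever minimizes U(x, u) + V(F(x, u))."""
    def q(u):
        return utility(x, u) + interp(V, dynamics(x, u))

    best_u, best_q = u_start, q(u_start)
    spread = 0.5
    for _ in range(gens):
        candidates = [
            min(max(best_u + rng.gauss(0.0, spread), -1.0), 1.0)
            for _ in range(pop)
        ]
        for u in candidates:
            qu = q(u)
            if qu < best_q:
                best_u, best_q = u, qu
        spread *= 0.5  # narrow the search around the incumbent
    return best_u, best_q


def adjustable_vi(omega=1.0, iters=60, seed=0):
    """Value iteration with an assumed relaxation factor omega, starting from V_0 = 0."""
    rng = random.Random(seed)
    V = [0.0] * len(GRID)
    policy = [0.0] * len(GRID)
    for _ in range(iters):
        new_V = []
        for k, x in enumerate(GRID):
            u, q = search_control(x, V, policy[k], rng)
            policy[k] = u
            # Relaxed update (assumed form): blend the Bellman backup with the old value.
            new_V.append(omega * q + (1.0 - omega) * V[k])
        V = new_V
    return V, policy


if __name__ == "__main__":
    V, policy = adjustable_vi()
    print("V(1.0) =", V[-1], "u(1.0) =", policy[-1])
```

With `omega=1.0` the update in this sketch reduces to ordinary value iteration; other values change how strongly each Bellman backup overwrites the previous estimate. The range of the relaxation factor for which the paper proves monotonicity and convergence is not given in the abstract, so none is assumed here.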
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.