离散时间智能临界控制的进阶值迭代研究

IF 13.9 2区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Artificial Intelligence Review Pub Date : 2023-05-21 DOI:10.1007/s10462-023-10497-1

Mingming Zhao, Ding Wang, Junfei Qiao, Mingming Ha, Jin Ren

{"title":"离散时间智能临界控制的进阶值迭代研究","authors":"Mingming Zhao, Ding Wang, Junfei Qiao, Mingming Ha, Jin Ren","doi":"10.1007/s10462-023-10497-1","DOIUrl":null,"url":null,"abstract":"<div><p>Optimal control problems are ubiquitous in practical engineering applications and social life with the idea of cost or resource conservation. Based on the critic learning scheme, adaptive dynamic programming (ADP) is regarded as a significant avenue to address the optimal control problems by combining the advanced design ideas such as adaptive control, reinforcement learning, and intelligent control. This survey introduces the recent development of ADP and related intelligent critic control with an emphasis on advanced value iteration (VI) schemes for discrete-time nonlinear systems. The theoretical results focus on convergence and stability properties for general VI, stabilizing VI, integrated VI, evolving VI, adjustable VI schemes and so on. Several significant applications are also elaborated in aspects of optimal regulation, optimal tracking, and zero-sum games. We aim to break through the bottleneck problems for VI algorithms in realizing evolving control, accelerating learning speed, and reducing the calculation expense. In addition, the prospects of new theoretical and technical fields for advanced VI schemes are looked ahead.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"56 10","pages":"12315 - 12346"},"PeriodicalIF":13.9000,"publicationDate":"2023-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Advanced value iteration for discrete-time intelligent critic control: A survey\",\"authors\":\"Mingming Zhao, Ding Wang, Junfei Qiao, Mingming Ha, Jin Ren\",\"doi\":\"10.1007/s10462-023-10497-1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Optimal control problems are ubiquitous in practical engineering applications and social life with the idea of cost or resource conservation. Based on the critic learning scheme, adaptive dynamic programming (ADP) is regarded as a significant avenue to address the optimal control problems by combining the advanced design ideas such as adaptive control, reinforcement learning, and intelligent control. This survey introduces the recent development of ADP and related intelligent critic control with an emphasis on advanced value iteration (VI) schemes for discrete-time nonlinear systems. The theoretical results focus on convergence and stability properties for general VI, stabilizing VI, integrated VI, evolving VI, adjustable VI schemes and so on. Several significant applications are also elaborated in aspects of optimal regulation, optimal tracking, and zero-sum games. We aim to break through the bottleneck problems for VI algorithms in realizing evolving control, accelerating learning speed, and reducing the calculation expense. In addition, the prospects of new theoretical and technical fields for advanced VI schemes are looked ahead.</p></div>\",\"PeriodicalId\":8449,\"journal\":{\"name\":\"Artificial Intelligence Review\",\"volume\":\"56 10\",\"pages\":\"12315 - 12346\"},\"PeriodicalIF\":13.9000,\"publicationDate\":\"2023-05-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Artificial Intelligence Review\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10462-023-10497-1\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence Review","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10462-023-10497-1","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 1

摘要

最优控制问题在实际工程应用和社会生活中普遍存在，其思想是节约成本或资源。自适应动态规划(ADP)是基于批评性学习方案，结合自适应控制、强化学习和智能控制等先进设计思想，解决最优控制问题的重要途径。本文介绍了ADP和相关智能临界控制的最新发展，重点介绍了离散非线性系统的先进值迭代(VI)方案。理论结果集中于一般VI、稳定VI、综合VI、演化VI、可调VI方案等的收敛性和稳定性。在最优调节、最优跟踪和零和博弈方面也阐述了几个重要的应用。我们的目标是突破VI算法在实现进化控制、加快学习速度、降低计算费用等方面的瓶颈问题。展望了先进VI方案的理论和技术新领域。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

Advanced value iteration for discrete-time intelligent critic control: A survey

查看原文本刊更多论文

Advanced value iteration for discrete-time intelligent critic control: A survey

Optimal control problems are ubiquitous in practical engineering applications and social life with the idea of cost or resource conservation. Based on the critic learning scheme, adaptive dynamic programming (ADP) is regarded as a significant avenue to address the optimal control problems by combining the advanced design ideas such as adaptive control, reinforcement learning, and intelligent control. This survey introduces the recent development of ADP and related intelligent critic control with an emphasis on advanced value iteration (VI) schemes for discrete-time nonlinear systems. The theoretical results focus on convergence and stability properties for general VI, stabilizing VI, integrated VI, evolving VI, adjustable VI schemes and so on. Several significant applications are also elaborated in aspects of optimal regulation, optimal tracking, and zero-sum games. We aim to break through the bottleneck problems for VI algorithms in realizing evolving control, accelerating learning speed, and reducing the calculation expense. In addition, the prospects of new theoretical and technical fields for advanced VI schemes are looked ahead.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Artificial Intelligence Review 工程技术-计算机：人工智能

CiteScore

22.00

自引率

3.30%

发文量

194

审稿时长

5.3 months

期刊介绍： Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.