Discounted Stable Adaptive Critic Design for Zero-Sum Games With Application Verifications

IF 6.4 | Zone 2 (Computer Science) | JCR Q1 (AUTOMATION & CONTROL SYSTEMS)
Jin Ren;Ding Wang;Menghua Li;Junfei Qiao
DOI: 10.1109/TASE.2025.3539772
Journal: IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 11706-11716
Published: 2025-02-07
Publication type: Journal Article
Open access: No
Source: https://ieeexplore.ieee.org/document/10877926/ (indexed via Semantic Scholar)
Citations: 0

Abstract
This paper establishes an adaptive critic design with a performance guarantee, based on the discounted value iteration algorithm, to solve the optimal regulation problem for discrete-time zero-sum games. Value iteration is used to obtain approximate optimal solutions to the Hamilton-Jacobi-Isaacs equation for nonlinear systems and to the game algebraic Riccati equation for linear systems. We then examine how the discount factor affects system stability and whether the policy pairs generated during value iteration are admissible. An appropriate selection range for the discount factor and criteria for ensuring system stability are established to help obtain a stabilizing optimal policy pair, which not only makes the cost function converge to the optimal value but also guarantees asymptotic stability of the closed-loop system. Finally, practical examples on a power system and a ball-beam system demonstrate the effectiveness of the presented method.

Note to Practitioners: Because many dynamic systems are subject to uncertainty and interference, zero-sum game problems are ubiquitous, especially for dynamic systems with antagonistic properties. As an important research direction in optimal control, zero-sum games usually involve designing policy pairs that optimize system performance in the presence of adversarial disturbances. Owing to its excellent adaptability, value iteration in adaptive dynamic programming is employed to handle this kind of problem. Besides the optimality of the policies, system stability during the control process is equally significant, since stability is the premise of all operations. We therefore provide guidance on the optimal regulation of discrete-time zero-sum games with a performance guarantee, which helps obtain a stable optimal policy pair. A theoretical stability analysis is provided and asymptotic stability of the system is ensured, which improves the performance of the designed controller. Furthermore, simulation experiments on practical applications verify the feasibility and effectiveness of the proposed control design.
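
For the linear case mentioned in the abstract, discounted value iteration reduces to a matrix recursion on a quadratic value function V_i(x) = x' P_i x. The following is a minimal sketch under assumptions that are not taken from the paper: dynamics x_{k+1} = A x_k + B u_k + D w_k, stage cost x'Qx + u'Ru - theta^2 w'w, discount factor gamma, and a hypothetical scalar toy system. The function names and all numbers are illustrative, and the spectral-radius check is a plain numerical test, not the stability criterion derived by the authors.

```python
import numpy as np


def discounted_zero_sum_vi(A, B, D, Q, R, theta, gamma, iters=500, tol=1e-10):
    """Discounted value iteration for a linear-quadratic zero-sum game.

    Assumed (illustrative) setup: u minimizes and w maximizes the sum of
    gamma**k * (x'Qx + u'Ru - theta**2 * w'w). Starts from P_0 = 0.
    Returns P and the gains (Ku, Kw) with u = -Ku x and w = -Kw x.
    """
    n, m = B.shape
    q = D.shape[1]
    G = np.hstack([B, D])                       # stacked input matrix [B D]
    Lam = np.zeros((m + q, m + q))              # blkdiag(R, -theta^2 I)
    Lam[:m, :m] = R
    Lam[m:, m:] = -theta**2 * np.eye(q)

    def gains(P):
        # Stationary point of the one-step min-max problem: [u; w] = -K x
        M = Lam + gamma * G.T @ P @ G
        return gamma * np.linalg.solve(M, G.T @ P @ A)

    P = np.zeros((n, n))
    for _ in range(iters):
        K = gains(P)
        P_next = Q + gamma * A.T @ P @ A - gamma * A.T @ P @ G @ K
        P_next = 0.5 * (P_next + P_next.T)      # keep the iterate symmetric
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break

    K = gains(P)
    return P, K[:m], K[m:]


def closed_loop_radius(A, B, D, Ku, Kw):
    """Spectral radius of the undiscounted closed loop A - B Ku - D Kw."""
    return float(np.max(np.abs(np.linalg.eigvals(A - B @ Ku - D @ Kw))))


# Hypothetical open-loop-unstable scalar system (not from the paper).
A = np.array([[1.2]])
B = np.array([[1.0]])
D = np.array([[0.5]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
THETA = 5.0

if __name__ == "__main__":
    for gamma in (0.95, 0.1):
        P, Ku, Kw = discounted_zero_sum_vi(A, B, D, Q, R, THETA, gamma)
        rho = closed_loop_radius(A, B, D, Ku, Kw)
        print(f"gamma={gamma}: P={P[0, 0]:.4f}, spectral radius={rho:.4f}")
```

With these toy numbers, the larger discount factor yields a closed-loop spectral radius below one, while the smaller one leaves the closed loop unstable even though the value iteration itself converges. This illustrates why the admissible range of the discount factor matters for stability.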
Source journal
IEEE Transactions on Automation Science and Engineering (Engineering & Technology: Automation & Control Systems)
CiteScore: 12.50
Self-citation rate: 14.30%
Articles published: 404
Review time: 3.0 months
About the journal: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.