Discounted Stable Adaptive Critic Design for Zero-Sum Games With Application Verifications
Authors: Jin Ren; Ding Wang; Menghua Li; Junfei Qiao
Journal: IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 11706-11716
Published: 2025-02-07
DOI: 10.1109/TASE.2025.3539772
URL: https://ieeexplore.ieee.org/document/10877926/
Impact factor: 6.4 (JCR Q1, Automation & Control Systems)
In this paper, an adaptive critic design with a performance guarantee is established based on the discounted value iteration algorithm to solve the optimal regulation problem for discrete-time zero-sum games. Value iteration is implemented to obtain approximate optimal solutions to the Hamilton-Jacobi-Isaacs equation for nonlinear systems and the game algebraic Riccati equation for linear systems. We then focus on how system stability is affected by the introduction of the discount factor and on the admissibility of the policy pairs in the value iteration process. An appropriate selection range for the discount factor and criteria for ensuring system stability are established to help obtain the stabilizing optimal policy pair, which not only makes the cost function converge to the optimal value but also guarantees the asymptotic stability of the closed-loop system. Finally, practical examples involving a power system and a ball-beam system are presented to demonstrate the effectiveness of the proposed method.

Note to Practitioners: Because many dynamic systems are subject to uncertainty and interference, zero-sum game problems are ubiquitous, especially in dynamic systems with antagonistic properties. As an important research direction in optimal control, zero-sum games usually involve designing policy pairs that optimize system performance in the presence of adversarial disturbances. Owing to its excellent adaptability, value iteration in adaptive dynamic programming is employed to deal with this kind of problem. Beyond the optimality of the policies, system stability during the control process is equally significant, since stability is the premise of all operations. Therefore, we provide guidance on the optimal regulation of discrete-time zero-sum games with a performance guarantee, which helps obtain the stable optimal policy pair. A theoretical stability analysis is provided and the asymptotic stability of the system is ensured, which improves the performance of the designed controller. Furthermore, simulation experiments for practical applications are conducted, which verify the feasibility and effectiveness of the proposed control design.
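
To make the role of the discount factor concrete, the following is a minimal sketch of discounted value iteration for a scalar linear-quadratic zero-sum game. It is not the authors' algorithm or their stability criteria. The system x_{k+1} = a*x + b*u + d*w, the stage cost q*x^2 + r*u^2 - beta2*w^2, the quadratic value guess V(x) = p*x^2, and every number and function name below are assumptions chosen for illustration only.

```python
# Toy scalar zero-sum game (all values are illustrative assumptions, not from the paper):
#   x_{k+1} = a*x_k + b*u_k + d*w_k
#   stage cost = q*x^2 + r*u^2 - beta2*w^2   (u minimizes, w maximizes)
#   value function guess V(x) = p*x^2, discount factor gamma in (0, 1]


def saddle_gains(p, a, b, d, r, beta2, gamma):
    """Return (ku, kw) so that u = -ku*x, w = -kw*x is the saddle point for V(x) = p*x^2."""
    # Hessian of the one-step objective with respect to [u, w].
    m11 = r + gamma * b * b * p
    m12 = gamma * b * d * p
    m22 = gamma * d * d * p - beta2
    # A saddle point needs convexity in u (m11 > 0) and concavity in w (m22 < 0).
    if m11 <= 0 or m22 >= 0:
        raise ValueError("saddle-point condition violated; try a larger beta2")
    det = m11 * m22 - m12 * m12  # strictly negative under the check above
    h1 = gamma * p * a * b
    h2 = gamma * p * a * d
    # Solve the 2x2 system [[m11, m12], [m12, m22]] @ [ku, kw] = [h1, h2] by hand.
    ku = (m22 * h1 - m12 * h2) / det
    kw = (m11 * h2 - m12 * h1) / det
    return ku, kw


def vi_step(p, a, b, d, q, r, beta2, gamma):
    """One sweep of the discounted game Riccati recursion for the scalar system."""
    ku, kw = saddle_gains(p, a, b, d, r, beta2, gamma)
    h1 = gamma * p * a * b
    h2 = gamma * p * a * d
    return q + gamma * a * a * p - (h1 * ku + h2 * kw)


def discounted_value_iteration(a, b, d, q, r, beta2, gamma, tol=1e-10, max_iter=10_000):
    """Iterate from p = 0 until successive iterates differ by less than tol."""
    p = 0.0
    for _ in range(max_iter):
        p_next = vi_step(p, a, b, d, q, r, beta2, gamma)
        if abs(p_next - p) < tol:
            ku, kw = saddle_gains(p_next, a, b, d, r, beta2, gamma)
            return p_next, ku, kw
        p = p_next
    raise RuntimeError("value iteration did not converge")


def closed_loop_pole(a, b, d, ku, kw):
    """Pole of x_{k+1} = (a - b*ku - d*kw) * x_k under the policy pair."""
    return a - b * ku - d * kw


# Hypothetical open-loop unstable plant (a > 1) with a weaker disturbance channel.
SYSTEM = dict(a=1.1, b=1.0, d=0.5, q=1.0, r=1.0, beta2=4.0)

for gamma in (0.95, 0.05):
    p, ku, kw = discounted_value_iteration(gamma=gamma, **SYSTEM)
    pole = closed_loop_pole(SYSTEM["a"], SYSTEM["b"], SYSTEM["d"], ku, kw)
    verdict = "stable" if abs(pole) < 1.0 else "unstable"
    print(f"gamma={gamma}: p={p:.4f}, ku={ku:.4f}, kw={kw:.4f}, pole={pole:.4f} ({verdict})")
```

With these assumed numbers, the iteration converges for both discount factors, but the closed-loop pole is roughly 0.42 for gamma = 0.95 and roughly 1.05 for gamma = 0.05. The cost-optimal policy pair can therefore fail to stabilize the system when the discount factor is too small, which is the kind of behavior that motivates restricting the discount factor to an appropriate range.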
About the journal:
The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.