Continuous-time Value-Iteration-Based Learning for Constrained-Input Nonlinear Nonzero-Sum Game

2022 4th International Conference on Data-driven Optimization of Complex Systems (DOCS) Pub Date : 2022-10-28 DOI:10.1109/DOCS55193.2022.9967754

Geyang Xiao, Yuan Liang, Linlin Yan, Xiaoyu Yi, Congqi Shen, Huifeng Zhang

Citations: 0

Abstract

This paper proposes a continuous-time value-iteration-based learning method for constrained-input nonlinear nonzero-sum games. Most existing studies are based on policy iteration and therefore require either an initial admissible control policy as the starting condition or a suitable control policy that makes the states satisfy the persistent excitation (PE) condition. However, there is no general, feasible way to derive either an initial admissible control policy or a control policy that satisfies the PE condition. This difficulty in choosing a control policy may limit practical application. The proposed method is based on value iteration, which avoids the need to choose such a control policy. Moreover, because control signals in practice must always stay within limits, the input constraints are taken into account. Simulation results demonstrate the effectiveness of the method.
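The abstract does not say how the input constraint is encoded. As a minimal sketch, assuming the nonquadratic control penalty that is common in the constrained-input adaptive dynamic programming literature (not necessarily the construction used in this paper), the Python below shows for a scalar input how a tanh-shaped feedback law keeps the control within the bound by construction. The names `saturated_control`, `nonquadratic_penalty`, `grad_v`, `g`, `r` and `u_max` are hypothetical and chosen only for illustration.

```python
import math


def saturated_control(grad_v: float, g: float, r: float, u_max: float) -> float:
    """Bounded feedback u = -u_max * tanh(g * grad_v / (2 * r * u_max)).

    Assumption: scalar input with gain g, control weight r > 0, bound
    |u| <= u_max, and grad_v standing in for the value-function gradient.
    """
    return -u_max * math.tanh(g * grad_v / (2.0 * r * u_max))


def nonquadratic_penalty(u: float, r: float, u_max: float) -> float:
    """Closed form of U(u) = 2 * r * integral_0^u u_max * atanh(v / u_max) dv.

    Valid for |u| < u_max; behaves like r * u**2 for small u and grows
    steeply as |u| approaches u_max.
    """
    ratio = u / u_max
    return (2.0 * r * u_max * u * math.atanh(ratio)
            + r * u_max ** 2 * math.log1p(-ratio ** 2))


if __name__ == "__main__":
    # Even a very large gradient yields a control inside the bound.
    for grad in (0.0, 0.5, 5.0, 1e6):
        u = saturated_control(grad, g=1.0, r=1.0, u_max=2.0)
        print(f"grad_v={grad:>9}: u={u:+.4f}")
```

Under this assumed penalty, the tanh law is the minimizer of `nonquadratic_penalty(u, r, u_max) + grad_v * g * u` over `u`, which is why the bound holds without clipping the control after the fact.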


