Continuous-time Value-Iteration-Based Learning for Constrained-Input Nonlinear Nonzero-Sum Game

Geyang Xiao, Yuan Liang, Linlin Yan, Xiaoyu Yi, Congqi Shen, Huifeng Zhang
2022 4th International Conference on Data-driven Optimization of Complex Systems (DOCS), published 2022-10-28. DOI: 10.1109/DOCS55193.2022.9967754 (https://doi.org/10.1109/DOCS55193.2022.9967754)

Abstract

This paper proposes a continuous-time value-iteration-based learning method for constrained-input nonlinear nonzero-sum games. Most existing studies are based on policy iteration and therefore require either an initial admissible control policy or a control policy that makes the states satisfy the persistent excitation (PE) condition. However, there is no general, feasible way to derive either an initial admissible control policy or a PE-satisfying control policy, and this difficulty in choosing a control policy may limit practical application. The proposed method is instead built on value iteration, which avoids the need to choose such a control policy. Moreover, because control signals in practice must always stay within limits, the input constraints are taken into account. Simulation results demonstrate the effectiveness of the method.
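The abstract does not say how the input constraint is encoded. A device commonly used in the constrained-input adaptive dynamic programming literature is a nonquadratic input penalty built from the inverse hyperbolic tangent, which yields a control law that saturates smoothly at a bound. The sketch below illustrates that device for a single scalar input; it is an assumption for illustration, not the authors' confirmed formulation. The function names and the parameters `grad_v` (value-function gradient), `g` (input gain), `r` (input weight), and `lam` (saturation bound) are all hypothetical.

```python
import math


def saturated_control(grad_v: float, g: float, r: float, lam: float) -> float:
    """Bounded scalar control u = -lam * tanh(g * grad_v / (2 * lam * r)).

    Assumed form (not taken from the abstract): the minimizer of a
    Hamiltonian whose input cost is `input_penalty` below.
    The result always satisfies |u| <= lam.
    """
    return -lam * math.tanh(g * grad_v / (2.0 * lam * r))


def input_penalty(u: float, r: float, lam: float) -> float:
    """Closed form of 2 * r * lam * integral_0^u atanh(v / lam) dv.

    Only defined for |u| < lam; the penalty grows steeply as |u|
    approaches the bound lam.
    """
    ratio = u / lam
    return (2.0 * lam * r * u * math.atanh(ratio)
            + lam ** 2 * r * math.log(1.0 - ratio ** 2))


if __name__ == "__main__":
    lam = 0.5  # hypothetical actuator limit
    for grad_v in (-100.0, -1.0, 0.0, 1.0, 100.0):
        u = saturated_control(grad_v, g=1.0, r=1.0, lam=lam)
        print(f"grad_v={grad_v:7.1f} -> u={u:+.4f}")
    # Evaluate the penalty strictly inside the admissible range only.
    for u in (0.0, 0.25, 0.45):
        print(f"u={u:.2f} -> penalty={input_penalty(u, r=1.0, lam=lam):.4f}")
```

In a nonzero-sum game each player would carry its own value-function gradient, gain, weight, and bound, so a function like `saturated_control` would be evaluated once per player with that player's quantities.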