Information-State-Based Reinforcement Learning for the Control of Partially Observed Nonlinear Systems

IF 8.9 · CAS Tier 1 (Computer Science) · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Raman Goyal, Mohamed Naveed Gul Mohamed, Ran Wang, Aayushman Sharma, Suman Chakravorty
DOI: 10.1109/TNNLS.2025.3593259 · Published: 2025-08-19 · IEEE Transactions on Neural Networks and Learning Systems
Citations: 0

Abstract


This article develops a model-based reinforcement learning (RL) approach to the closed-loop control of nonlinear dynamical systems with a partial nonlinear observation model. We propose an "information-state"-based approach to rigorously transform the partially observed problem into a fully observed problem where the information state consists of the past several observations and control inputs. We further show the equivalence of the transformed and the initial partially observed optimal control problems and provide the conditions to solve for the deterministic optimal solution. We develop a data-based generalization of the iterative linear quadratic regulator (ILQR) for the RL of partially observed systems using a local linear time-varying model of the information-state dynamics approximated by an autoregressive-moving-average (ARMA) model that is generated using only the input-output data. This approach allows us to design a local perturbation feedback control law that provides an optimum solution to the partially observed feedback design problem locally. The efficacy of the developed method is shown by controlling complex high-dimensional nonlinear dynamical systems in the presence of model and sensing uncertainty.
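As a rough illustration of the information-state idea described above (a minimal sketch, not code from the paper — the function names, the scalar test system, and the lag order `q` are all assumptions), the snippet below stacks the past `q` observations and control inputs into an information-state vector and fits the ARMA coefficients by ordinary least squares using only input-output data:

```python
import numpy as np

def information_state(ys, us, q):
    """Stack the past q observations and past q controls into one vector."""
    return np.concatenate([np.ravel(ys[-q:]), np.ravel(us[-q:])])

def fit_arma(ys, us, q):
    """Least-squares fit of y_t ≈ sum_i A_i y_{t-i} + sum_i B_i u_{t-i},
    using only the recorded input-output data (ys, us)."""
    rows, targets = [], []
    for t in range(q, len(ys)):
        rows.append(information_state(ys[:t], us[:t], q))
        targets.append(ys[t])
    Z = np.asarray(rows)       # regressors: one information state per row
    Y = np.asarray(targets)    # next observations to be predicted
    coeffs, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return coeffs              # stacked ARMA coefficients

# Hypothetical scalar system y_t = 0.5*y_{t-1} + 0.2*u_{t-1}, lag q = 1.
rng = np.random.default_rng(0)
us = rng.standard_normal(50)
ys = np.zeros(50)
for t in range(1, 50):
    ys[t] = 0.5 * ys[t - 1] + 0.2 * us[t - 1]
coeffs = fit_arma(ys, us, q=1)
```

On this noiseless example the fit recovers the true coefficients (0.5, 0.2); in the paper the same kind of locally fitted linear time-varying model is what the ILQR iterations are run on.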

Journal
IEEE Transactions on Neural Networks and Learning Systems
Categories: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
CiteScore: 23.80
Self-citation rate: 9.60%
Articles per year: 2102
Review time: 3-8 weeks
About the journal: The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.