Data-driven disturbance compensation control for discrete-time systems based on reinforcement learning

IF 3.9 · Zone 4, Computer Science · JCR Q2, Automation & Control Systems
Lanyue Li, Jinna Li, Jiangtao Cao
Journal: International Journal of Adaptive Control and Signal Processing
Published: 2024-03-22
Publication type: Journal Article
DOI: 10.1002/acs.3793 (https://doi.org/10.1002/acs.3793)
Citations: 0

Summary: In this article, a self-learning disturbance compensation control method is developed that enables unknown discrete-time (DT) systems to achieve performance optimization in the presence of disturbances. Unlike traditional model-based and data-driven state feedback control methods, the developed off-policy Q-learning algorithm updates the state feedback controller parameters and the compensator parameters by actively interacting with the unknown environment, so that approximately optimal tracking can be realized using only data. First, an optimal tracking problem for a linear DT system with disturbance is formulated. Then, the controller design is achieved by solving a zero-sum game problem, leading to an off-policy disturbance compensation Q-learning algorithm with only a critic structure, which uses data to update the disturbance compensation controller gains without knowledge of the system dynamics. Finally, the effectiveness of the proposed method is verified by simulations.
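The summary gives no equations, so the following is only a minimal sketch of the general idea it names: off-policy Q-learning for a linear-quadratic zero-sum game, with a single quadratic critic and no actor. It is not the authors' algorithm. It handles regulation rather than tracking, and the plant values (`A`, `B`, `D`), the weights (`Qx`, `R`, `gamma`), and all function names are assumptions made for illustration. The plant matrices are used only to generate data; the learner never reads them. Starting from zero gains assumes the open-loop plant is stable.

```python
import numpy as np

def quad_features(Z):
    """Row-wise Kronecker product, so that z^T H z == quad_features(z) @ H.ravel()."""
    return np.einsum("ki,kj->kij", Z, Z).reshape(Z.shape[0], -1)

def q_learning_zero_sum(X, U, W, Xn, Qx, R, gamma, iters=10):
    """Critic-only policy iteration on the Q-function kernel H, using data only.

    X, U, W, Xn hold samples of state, control, disturbance and next state.
    The target policies are [u; w] = -G x; the data may come from any
    exploratory behavior policy (that is the off-policy part).
    """
    n = X.shape[1]
    m = U.shape[1] + W.shape[1]
    Z = np.hstack([X, U, W])
    # Zero-sum stage cost: x'Qx x + u'R u - gamma^2 w'w
    cost = (np.einsum("ki,ij,kj->k", X, Qx, X)
            + np.einsum("ki,ij,kj->k", U, R, U)
            - gamma**2 * np.sum(W**2, axis=1))
    G = np.zeros((m, n))  # assumption: zero gains are admissible (stable plant)
    for _ in range(iters):
        # Policy evaluation: z_k' H z_k - z_{k+1}' H z_{k+1} = cost_k,
        # where z_{k+1} applies the current target policies at x_{k+1}.
        Zn = np.hstack([Xn, -Xn @ G.T])
        Phi = quad_features(Z) - quad_features(Zn)
        theta, *_ = np.linalg.lstsq(Phi, cost, rcond=None)
        H = theta.reshape(n + m, n + m)
        H = 0.5 * (H + H.T)  # keep the symmetric part
        # Policy improvement: stationary point of z' H z in (u, w)
        G = np.linalg.solve(H[n:, n:], H[n:, :n])
    return H, G

# Hypothetical scalar plant, used here only as a data generator.
rng = np.random.default_rng(0)
A, B, D = np.array([[0.8]]), np.array([[1.0]]), np.array([[0.5]])
N = 60
X = rng.standard_normal((N, 1))
U = rng.standard_normal((N, 1))   # exploratory control samples
W = rng.standard_normal((N, 1))   # exploratory disturbance samples
Xn = X @ A.T + U @ B.T + W @ D.T

H, G = q_learning_zero_sum(X, U, W, Xn, Qx=np.eye(1), R=np.eye(1), gamma=2.0)
print("control gain:", G[0], "disturbance gain:", G[1])
```

The least-squares step fits all entries of `H` at once from the Bellman equation. Because the quadratic features contain duplicate columns, `lstsq` returns the minimum-norm solution, and the code keeps its symmetric part. A tracking or compensation version would augment the state with reference and disturbance terms, which the summary does not specify.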
Source journal
CiteScore: 5.30
Self-citation rate: 16.10%
Articles published: 163
Review time: 5 months
Journal description: The International Journal of Adaptive Control and Signal Processing is concerned with the design, synthesis and application of estimators or controllers where adaptive features are needed to cope with uncertainties. Papers on signal processing should also have some relevance to adaptive systems. The journal's focus is on model-based control design approaches rather than heuristic or rule-based control design methods. All papers will be expected to include significant novel material. Both the theory and application of adaptive systems and system identification are areas of interest. Papers on applications can include problems in the implementation of algorithms for real-time signal processing and control. The stability, convergence, robustness and numerical aspects of adaptive algorithms are also suitable topics. The related subjects of controller tuning, filtering, networks and switching theory are also of interest. Principal areas to be addressed include: Auto-Tuning, Self-Tuning and Model Reference Adaptive Controllers; Nonlinear, Robust and Intelligent Adaptive Controllers; Linear and Nonlinear Multivariable System Identification and Estimation; Identification of Linear Parameter Varying, Distributed and Hybrid Systems; Multiple Model Adaptive Control; Adaptive Signal Processing Theory and Algorithms; Adaptation in Multi-Agent Systems; Condition Monitoring Systems; Fault Detection and Isolation Methods; Fault-Tolerant Control (system supervision and diagnosis); Learning Systems and Adaptive Modelling; Real Time Algorithms for Adaptive Signal Processing and Control; Adaptive Signal Processing and Control Applications; Adaptive Cloud Architectures and Networking; Adaptive Mechanisms for Internet of Things; Adaptive Sliding Mode Control.