Parameterized Adaptive Controller Design using Reinforcement Learning and Deep Neural Networks

Kranthi Kumar P, K. Detroja
2022 Eighth Indian Control Conference (ICC), published 2022-12-14
DOI: 10.1109/ICC56513.2022.10093404
Citations: 0

Abstract

This manuscript aims to build a parameterized adaptive controller using Reinforcement Learning (RL) and Deep Neural Networks (DNN). The main objective is to adapt the parameters of any given controller structure using an RL formulation and achieve better performance. In recent years, reinforcement learning has gained much attention, and its advantages make it well suited to the adaptive tuning of parameterized controllers. With advances in computational power, it has become easier to approximate a complex policy function with a deep neural network to achieve better accuracy and performance. The conventional Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm may not provide the best possible training for a given number of episodes. To improve the performance of the TD3 algorithm, a dynamic action space is proposed along with a modified reward function designed to aid faster convergence. The proposed algorithm improves performance by dynamically modifying the action space in the existing TD3 algorithm. The effectiveness of the proposed RL-based parameterized controller is demonstrated on a standard first-order system by designing an adaptive PI controller. A case study involving a 3-DOF (Degree of Freedom) gyroscope system, which is an unstable plant, is also presented. For the 3-DOF gyroscope system, an adaptive lead controller is designed, and the proposed algorithm provides faster convergence and better performance compared to the original TD3 algorithm.
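The abstract does not include code, and the authors' TD3 implementation is not reproduced here. As a minimal illustrative sketch of the "dynamic action space" idea, the snippet below contracts the range from which candidate PI gains are drawn as training progresses, tuning a PI controller on a standard first-order plant. A simple random-search loop stands in for the RL agent, and an integrated-absolute-error cost stands in for the reward; all function names, gain bounds, and numeric choices are assumptions, not the paper's method.

```python
def step_response_cost(kp, ki, dt=0.01, t_end=5.0):
    """Simulate a first-order plant dy/dt = -y + u under PI control
    tracking a unit step; return the integrated absolute error (IAE)."""
    y, integ, cost = 0.0, 0.0, 0.0
    for _ in range(int(t_end / dt)):
        e = 1.0 - y                       # tracking error
        integ += e * dt                   # integral of the error
        u = kp * e + ki * integ           # PI control law
        y += dt * (-y + u)                # forward-Euler plant step
        cost += abs(e) * dt               # accumulate IAE
    return cost

def tune_pi_dynamic_action_space(episodes=30, seed=0):
    """Random-search stand-in for the RL agent: each episode, sample
    gains from an action space centered on the best-known gains, then
    contract that action space -- the 'dynamic action space' idea."""
    import random
    rng = random.Random(seed)
    half_width = 2.0                      # initial action-space half-width
    best_gains = [1.0, 1.0]               # initial (kp, ki) guess
    best_cost = step_response_cost(*best_gains)
    for _ in range(episodes):
        kp = max(0.0, best_gains[0] + rng.uniform(-half_width, half_width))
        ki = max(0.0, best_gains[1] + rng.uniform(-half_width, half_width))
        cost = step_response_cost(kp, ki)
        if cost < best_cost:
            best_gains, best_cost = [kp, ki], cost
        half_width *= 0.9                 # dynamically shrink the action space
    return best_gains, best_cost
```

Shrinking the sampling range around the incumbent gains concentrates exploration near promising regions, which is one plausible reading of why a dynamic action space can converge in fewer episodes than a fixed one.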