Heuristic and deep reinforcement learning-based PID control of trajectory tracking in a ball-and-plate system

Impact Factor 2.7 · Q2 (Computer Science, Information Systems)
E. Okafor, D. Udekwe, Y. Ibrahim, M. B. Mu'azu, E. Okafor
{"title":"Heuristic and deep reinforcement learning-based PID control of trajectory tracking in a ball-and-plate system","authors":"E. Okafor, D. Udekwe, Y. Ibrahim, M. B. Mu'azu, E. Okafor","doi":"10.1080/24751839.2020.1833137","DOIUrl":null,"url":null,"abstract":"ABSTRACT The manual tuning of controller parameters, for example, tuning proportional integral derivative (PID) gains often relies on tedious human engineering. To curb the aforementioned problem, we propose an artificial intelligence-based deep reinforcement learning (RL) PID controller (three variants) compared with genetic algorithm-based PID (GA-PID) and classical PID; a total of five controllers were simulated for controlling and trajectory tracking of the ball dynamics in a linearized ball-and-plate ( ) system. For the experiments, we trained novel variants of deep RL-PID built from a customized deep deterministic policy gradient (DDPG) agent (by modifying the neural network architecture), resulting in two new RL agents (DDPG-FC-350-R-PID & DDPG-FC-350-E-PID). Each of the agents interacts with the environment through a policy and a learning algorithm to produce a set of actions (optimal PID gains). Additionally, we evaluated the five controllers to assess which method provides the best performance metrics in the context of the minimum index in predictive errors, steady-state-error, peak overshoot, and time-responses. The results show that our proposed architecture (DDPG-FC-350-E-PID) yielded the best performance and surpasses all other approaches on most of the evaluation metric indices. Furthermore, an appropriate training of an artificial intelligence-based controller can aid to obtain the best path tracking.","PeriodicalId":32180,"journal":{"name":"Journal of Information and Telecommunication","volume":"5 1","pages":"179 - 196"},"PeriodicalIF":2.7000,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/24751839.2020.1833137","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Information and Telecommunication","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1080/24751839.2020.1833137","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 8

Abstract

Manual tuning of controller parameters, for example proportional-integral-derivative (PID) gains, often relies on tedious human engineering. To address this problem, we propose an artificial-intelligence-based deep reinforcement learning (RL) PID controller (three variants), compared against a genetic-algorithm-based PID (GA-PID) and a classical PID; five controllers in total were simulated for control and trajectory tracking of the ball dynamics in a linearized ball-and-plate system. For the experiments, we trained novel deep RL-PID variants built from a customized deep deterministic policy gradient (DDPG) agent (by modifying the neural-network architecture), resulting in two new RL agents (DDPG-FC-350-R-PID and DDPG-FC-350-E-PID). Each agent interacts with the environment through a policy and a learning algorithm to produce a set of actions (the optimal PID gains). We then evaluated the five controllers to determine which method performs best with respect to minimum predictive-error indices, steady-state error, peak overshoot, and time responses. The results show that our proposed architecture (DDPG-FC-350-E-PID) yielded the best performance, surpassing all other approaches on most of the evaluation metrics. Furthermore, appropriate training of an artificial-intelligence-based controller can help achieve the best path tracking.
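The paper does not publish its implementation, but the abstract's core idea, an RL agent whose action vector supplies the PID gains u(t) = Kp·e(t) + Ki·∫e dτ + Kd·de/dt, which are then scored on error indices such as integral squared error (ISE) and peak overshoot, can be illustrated with a minimal sketch. Everything below is assumed for illustration: the function and class names (DiscretePID, evaluate_gains), the double-integrator plant standing in for one linearized ball-and-plate axis, and the example gain values; none of it comes from the paper.

```python
# Illustrative sketch only; all names, the plant model, and the gain values
# are hypothetical, not taken from the paper.

class DiscretePID:
    """Discrete-time PID: u = Kp*e + Ki*sum(e)*dt + Kd*(e - e_prev)/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def evaluate_gains(gains, reference, dt=0.01, steps=1000):
    """Roll out one episode with RL-proposed gains and return simple
    performance indices of the kind the paper compares (ISE, peak
    overshoot). The plant is a placeholder double integrator, a common
    linearization of a single ball-and-plate axis."""
    kp, ki, kd = gains
    pid = DiscretePID(kp, ki, kd, dt)
    pos, vel = 0.0, 0.0
    ise, peak_overshoot = 0.0, 0.0
    for _ in range(steps):
        error = reference - pos
        u = pid.step(error)              # plate-tilt command
        vel += u * dt                    # double-integrator ball dynamics
        pos += vel * dt
        ise += error ** 2 * dt           # integral of squared error
        peak_overshoot = max(peak_overshoot, pos - reference)
    return ise, peak_overshoot


# E.g. gains a trained DDPG-style agent might propose for a unit step:
print(evaluate_gains(gains=(8.0, 0.5, 4.0), reference=1.0))
```

In a DDPG-based tuner of the kind the abstract describes, evaluate_gains would play the role of the environment rollout: the agent's three-dimensional action (Kp, Ki, Kd) is applied, the resulting error indices shape the reward, and the policy is updated toward gain sets that minimize them.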
Source journal
CiteScore: 7.50
Self-citation rate: 0.00%
Articles published: 18
Review time: 27 weeks