Heuristic and deep reinforcement learning-based PID control of trajectory tracking in a ball-and-plate system

Impact factor: 1.7 | JCR: Q2 | Category: Computer Science, Information Systems

Journal of Information and Telecommunication | Pub Date: 2020-10-21 | DOI: 10.1080/24751839.2020.1833137

E. Okafor, D. Udekwe, Y. Ibrahim, M. B. Mu'azu, E. Okafor

Volume 5, pp. 179-196 | https://doi.org/10.1080/24751839.2020.1833137

Citations: 8


Heuristic and deep reinforcement learning-based PID control of trajectory tracking in a ball-and-plate system

ABSTRACT: Manual tuning of controller parameters, such as proportional-integral-derivative (PID) gains, often relies on tedious human engineering. To address this problem, we propose an artificial intelligence-based deep reinforcement learning (RL) PID controller (three variants) and compare it with a genetic algorithm-based PID (GA-PID) and a classical PID; in total, five controllers were simulated for control and trajectory tracking of the ball dynamics in a linearized ball-and-plate ( ) system. For the experiments, we trained novel variants of deep RL-PID built from a customized deep deterministic policy gradient (DDPG) agent (by modifying the neural network architecture), resulting in two new RL agents (DDPG-FC-350-R-PID and DDPG-FC-350-E-PID). Each agent interacts with the environment through a policy and a learning algorithm to produce a set of actions (optimal PID gains). Additionally, we evaluated the five controllers to assess which method provides the best performance in terms of the minimum index of predictive errors, steady-state error, peak overshoot, and time responses. The results show that our proposed architecture (DDPG-FC-350-E-PID) yielded the best performance and surpassed all other approaches on most of the evaluation metrics. Furthermore, appropriate training of an artificial intelligence-based controller can help obtain the best path tracking.
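To make the evaluation criteria named in the abstract more concrete, the following is a minimal sketch of a PID loop on one axis of a linearized ball-and-plate model, together with three step-response metrics (peak overshoot, steady-state error, and integral of absolute error). It is not the authors' implementation. The plant constant `K_PLANT` (a solid ball rolling without slipping at small plate angles), the forward-Euler integration, the function names `simulate_pid` and `step_metrics`, and the hand-picked gains are all assumptions made for illustration; in the paper's setting the gains would instead come from the GA or from a trained DDPG agent.

```python
import numpy as np

G = 9.81
# Assumed plant: x'' = K_PLANT * theta for a solid ball rolling without
# slipping on a plate tilted by a small angle theta (not taken from the paper).
K_PLANT = 5.0 / 7.0 * G


def simulate_pid(kp, ki, kd, setpoint=0.1, dt=0.001, t_end=20.0):
    """Simulate one axis under PID control with simple Euler integration."""
    n = int(t_end / dt)
    t = np.arange(n) * dt
    x = np.zeros(n)
    pos, vel, integral = 0.0, 0.0, 0.0
    prev_err = setpoint - pos  # avoids a derivative spike on the first step
    for i in range(n):
        x[i] = pos
        err = setpoint - pos
        integral += err * dt
        deriv = (err - prev_err) / dt
        theta = kp * err + ki * integral + kd * deriv  # plate angle command
        prev_err = err
        vel += K_PLANT * theta * dt
        pos += vel * dt
    return t, x


def step_metrics(t, x, setpoint):
    """Return peak overshoot (%), final absolute error, and IAE."""
    err = setpoint - x
    dt = t[1] - t[0]
    return {
        "overshoot_pct": max(0.0, (x.max() - setpoint) / setpoint * 100.0),
        "steady_state_error": float(abs(err[-1])),
        "iae": float(np.sum(np.abs(err)) * dt),
    }


if __name__ == "__main__":
    # Hypothetical gains chosen by hand; a tuner would search over these.
    t, x = simulate_pid(kp=2.0, ki=0.5, kd=1.0)
    print(step_metrics(t, x, setpoint=0.1))
```

A tuning method, whether heuristic or learned, can be viewed as searching for the `(kp, ki, kd)` triple that minimizes some combination of these metrics; the weighting of the metrics is a design choice that this sketch leaves open.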


Source journal

Journal of Information and Telecommunication Multiple-

CiteScore: 7.50

Self-citation rate: 0.00%

Review time: 27 weeks