E. Okabe, Victor Paiva, Luis Silva-Teixeira, J. Izuka
Journal of Computational and Nonlinear Dynamics (Q3, ENGINEERING, MECHANICAL; Impact Factor 1.9), DOI: 10.1115/1.4063222, published 2023-08-24.
Cable SCARA Robot Controlled by a Neural Network Using Reinforcement Learning
In this work, three reinforcement learning algorithms (Proximal Policy Optimization, Soft Actor-Critic, and Twin Delayed Deep Deterministic Policy Gradient) are employed to control a two-link SCARA robot. This robot has three cables attached to its end-effector, which create a triangular workspace. Positioning the end-effector within the workspace is a relatively simple kinematic problem, but moving outside this region, although possible, requires a nonlinear dynamic model and a state-of-the-art controller. To solve this problem in a simple manner, reinforcement learning algorithms are used to find feasible trajectories to three targets outside the workspace. Additionally, the SCARA mechanism offers two possible configurations for each end-effector position. The algorithms are compared in terms of displacement error, velocity, and standard deviation across ten trajectories produced by the trained network. The results indicate that Proximal Policy Optimization is the most consistent in the analyzed situations. Still, Soft Actor-Critic presented better solutions, and Twin Delayed Deep Deterministic Policy Gradient provided more unusual and interesting trajectories.
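The abstract notes that the SCARA mechanism admits two possible configurations for each end-effector position; for a planar two-link arm these are the familiar elbow-up and elbow-down inverse-kinematics solutions. The following is a minimal illustrative sketch of that kinematic ambiguity, not the paper's actual model: the link lengths and the target point are assumed values chosen only for demonstration.

```python
import math

def forward_kinematics(theta1, theta2, l1=0.5, l2=0.5):
    """End-effector (x, y) of a planar two-link arm.
    Link lengths l1, l2 are assumed, not taken from the paper."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

def inverse_kinematics(x, y, l1=0.5, l2=0.5):
    """Return both joint-angle solutions (elbow-up and elbow-down)
    for a reachable target point (x, y)."""
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if abs(c2) > 1:
        raise ValueError("target outside the reachable annulus")
    solutions = []
    for sign in (+1, -1):  # the two configurations differ in the sign of theta2
        theta2 = sign * math.acos(c2)
        k1 = l1 + l2 * math.cos(theta2)
        k2 = l2 * math.sin(theta2)
        theta1 = math.atan2(y, x) - math.atan2(k2, k1)
        solutions.append((theta1, theta2))
    return solutions
```

Both returned configurations map back to the same end-effector position through the forward kinematics, which is why a controller (learned or classical) must commit to one branch when tracking a trajectory.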
Journal introduction:
The purpose of the Journal of Computational and Nonlinear Dynamics is to provide a medium for rapid dissemination of original research results in theoretical as well as applied computational and nonlinear dynamics. The journal serves as a forum for the exchange of new ideas and applications in computational, rigid and flexible multi-body system dynamics and all aspects (analytical, numerical, and experimental) of dynamics associated with nonlinear systems. The broad scope of the journal encompasses all computational and nonlinear problems occurring in aeronautical, biological, electrical, mechanical, physical, and structural systems.