E. Okabe, Victor Paiva, Luis Silva-Teixeira, J. Izuka
Journal of Computational and Nonlinear Dynamics (Q3, ENGINEERING, MECHANICAL; Impact Factor 1.9), DOI: 10.1115/1.4063222, published 2023-08-24.
Cable SCARA Robot Controlled by a Neural Network Using Reinforcement Learning
In this work, three reinforcement learning algorithms (Proximal Policy Optimization, Soft Actor-Critic, and Twin Delayed Deep Deterministic Policy Gradient) are employed to control a two-link SCARA robot. This robot has three cables attached to its end-effector, which create a triangular workspace. Positioning the end-effector within the workspace is a relatively simple kinematic problem, but moving outside this region, although possible, requires a nonlinear dynamic model and a state-of-the-art controller. To solve this problem in a simple manner, reinforcement learning algorithms are used to find feasible trajectories to three targets outside the workspace. Additionally, the SCARA mechanism offers two possible configurations for each end-effector position. The algorithms are compared in terms of displacement error, velocity, and standard deviation across ten trajectories produced by the trained network. The results indicate that Proximal Policy Optimization is the most consistent in the analyzed situations. Still, Soft Actor-Critic presented better solutions, and Twin Delayed Deep Deterministic Policy Gradient provided more unusual and interesting trajectories.
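The abstract notes that the SCARA mechanism admits two possible configurations for each end-effector position; for a planar two-link arm these are the familiar elbow-up and elbow-down inverse-kinematics solutions. The following is a minimal illustrative sketch of that kinematic ambiguity, not the paper's actual model: the link lengths and the target point are assumed values chosen only for demonstration.

```python
import math

def forward_kinematics(theta1, theta2, l1=0.5, l2=0.5):
    """End-effector (x, y) of a planar two-link arm.
    Link lengths l1, l2 are assumed, not taken from the paper."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

def inverse_kinematics(x, y, l1=0.5, l2=0.5):
    """Return both joint-angle solutions (elbow-up and elbow-down)
    for a reachable target point (x, y)."""
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if abs(c2) > 1:
        raise ValueError("target outside the reachable annulus")
    solutions = []
    for sign in (+1, -1):  # the two configurations differ in the sign of theta2
        theta2 = sign * math.acos(c2)
        k1 = l1 + l2 * math.cos(theta2)
        k2 = l2 * math.sin(theta2)
        theta1 = math.atan2(y, x) - math.atan2(k2, k1)
        solutions.append((theta1, theta2))
    return solutions
```

Both returned configurations map back to the same end-effector position through the forward kinematics, which is why a controller (learned or classical) must commit to one branch when tracking a trajectory.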
Journal introduction:
The purpose of the Journal of Computational and Nonlinear Dynamics is to provide a medium for rapid dissemination of original research results in theoretical as well as applied computational and nonlinear dynamics. The journal serves as a forum for the exchange of new ideas and applications in computational, rigid and flexible multi-body system dynamics and all aspects (analytical, numerical, and experimental) of dynamics associated with nonlinear systems. The broad scope of the journal encompasses all computational and nonlinear problems occurring in aeronautical, biological, electrical, mechanical, physical, and structural systems.