带约束条件的深度确定性策略梯度，用于优化双足机器人的步态

IF 5.3 2区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Integrated Computer-Aided Engineering Pub Date : 2023-12-15 DOI:10.3233/ica-230724

Xingyang Liu, Haina Rong, Ferrante Neri, Peng Yue, Gexiang Zhang

{"title":"带约束条件的深度确定性策略梯度，用于优化双足机器人的步态","authors":"Xingyang Liu, Haina Rong, Ferrante Neri, Peng Yue, Gexiang Zhang","doi":"10.3233/ica-230724","DOIUrl":null,"url":null,"abstract":"In this paper, we propose a novel Reinforcement Learning (RL) algorithm for robotic motion control, that is, a constrained Deep Deterministic Policy Gradient (DDPG) deviation learning strategy to assist biped robots in walking safely and accurately. The previous research on this topic highlighted the limitations in the controller’s ability to accurately track foot placement on discrete terrains and the lack of consideration for safety concerns. In this study, we address these challenges by focusing on ensuring the overall system’s safety. To begin with, we tackle the inverse kinematics problem by introducing constraints to the damping least squares method. This enhancement not only addresses singularity issues but also guarantees safe ranges for joint angles, thus ensuring the stability and reliability of the system. Based on this, we propose the adoption of the constrained DDPG method to correct controller deviations. In constrained DDPG, we incorporate a constraint layer into the Actor network, incorporating joint deviations as state inputs. By conducting offline training within the range of safe angles, it serves as a deviation corrector. Lastly, we validate the effectiveness of our proposed approach by conducting dynamic simulations using the CRANE biped robot. Through comprehensive assessments, including singularity analysis, constraint effectiveness evaluation, and walking experiments on discrete terrains, we demonstrate the superiority and practicality of our approach in enhancing walking performance while ensuring safety. Overall, our research contributes to the advancement of biped robot locomotion by addressing gait optimisation from multiple perspectives, including singularity handling, safety constraints, and deviation learning.","PeriodicalId":50358,"journal":{"name":"Integrated Computer-Aided Engineering","volume":"25 1","pages":""},"PeriodicalIF":5.3000,"publicationDate":"2023-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Deep deterministic policy gradient with constraints for gait optimisation of biped robots\",\"authors\":\"Xingyang Liu, Haina Rong, Ferrante Neri, Peng Yue, Gexiang Zhang\",\"doi\":\"10.3233/ica-230724\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we propose a novel Reinforcement Learning (RL) algorithm for robotic motion control, that is, a constrained Deep Deterministic Policy Gradient (DDPG) deviation learning strategy to assist biped robots in walking safely and accurately. The previous research on this topic highlighted the limitations in the controller’s ability to accurately track foot placement on discrete terrains and the lack of consideration for safety concerns. In this study, we address these challenges by focusing on ensuring the overall system’s safety. To begin with, we tackle the inverse kinematics problem by introducing constraints to the damping least squares method. This enhancement not only addresses singularity issues but also guarantees safe ranges for joint angles, thus ensuring the stability and reliability of the system. Based on this, we propose the adoption of the constrained DDPG method to correct controller deviations. In constrained DDPG, we incorporate a constraint layer into the Actor network, incorporating joint deviations as state inputs. By conducting offline training within the range of safe angles, it serves as a deviation corrector. Lastly, we validate the effectiveness of our proposed approach by conducting dynamic simulations using the CRANE biped robot. Through comprehensive assessments, including singularity analysis, constraint effectiveness evaluation, and walking experiments on discrete terrains, we demonstrate the superiority and practicality of our approach in enhancing walking performance while ensuring safety. Overall, our research contributes to the advancement of biped robot locomotion by addressing gait optimisation from multiple perspectives, including singularity handling, safety constraints, and deviation learning.\",\"PeriodicalId\":50358,\"journal\":{\"name\":\"Integrated Computer-Aided Engineering\",\"volume\":\"25 1\",\"pages\":\"\"},\"PeriodicalIF\":5.3000,\"publicationDate\":\"2023-12-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Integrated Computer-Aided Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.3233/ica-230724\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Integrated Computer-Aided Engineering","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.3233/ica-230724","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

在本文中，我们提出了一种用于机器人运动控制的新型强化学习（RL）算法，即有约束的深度确定性策略梯度（DDPG）偏差学习策略，以帮助双足机器人安全、准确地行走。以往关于这一主题的研究强调了控制器在离散地形上精确跟踪脚部位置能力的局限性，以及缺乏对安全问题的考虑。在本研究中，我们通过重点确保整个系统的安全性来应对这些挑战。首先，我们通过在阻尼最小二乘法中引入约束条件来解决逆运动学问题。这一改进不仅解决了奇异性问题，还保证了关节角度的安全范围，从而确保了系统的稳定性和可靠性。在此基础上，我们提出采用约束 DDPG 方法来修正控制器偏差。在受约束 DDPG 中，我们在 Actor 网络中加入了一个约束层，将关节偏差作为状态输入。通过在安全角度范围内进行离线训练，它可作为偏差校正器。最后，我们通过使用 CRANE 双足机器人进行动态模拟，验证了我们提出的方法的有效性。通过奇异性分析、约束有效性评估和离散地形行走实验等综合评估，我们证明了我们的方法在提高行走性能、确保安全方面的优越性和实用性。总之，我们的研究从奇异性处理、安全约束和偏差学习等多个角度解决了步态优化问题，为双足机器人运动的发展做出了贡献。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Deep deterministic policy gradient with constraints for gait optimisation of biped robots

In this paper, we propose a novel Reinforcement Learning (RL) algorithm for robotic motion control, that is, a constrained Deep Deterministic Policy Gradient (DDPG) deviation learning strategy to assist biped robots in walking safely and accurately. The previous research on this topic highlighted the limitations in the controller’s ability to accurately track foot placement on discrete terrains and the lack of consideration for safety concerns. In this study, we address these challenges by focusing on ensuring the overall system’s safety. To begin with, we tackle the inverse kinematics problem by introducing constraints to the damping least squares method. This enhancement not only addresses singularity issues but also guarantees safe ranges for joint angles, thus ensuring the stability and reliability of the system. Based on this, we propose the adoption of the constrained DDPG method to correct controller deviations. In constrained DDPG, we incorporate a constraint layer into the Actor network, incorporating joint deviations as state inputs. By conducting offline training within the range of safe angles, it serves as a deviation corrector. Lastly, we validate the effectiveness of our proposed approach by conducting dynamic simulations using the CRANE biped robot. Through comprehensive assessments, including singularity analysis, constraint effectiveness evaluation, and walking experiments on discrete terrains, we demonstrate the superiority and practicality of our approach in enhancing walking performance while ensuring safety. Overall, our research contributes to the advancement of biped robot locomotion by addressing gait optimisation from multiple perspectives, including singularity handling, safety constraints, and deviation learning.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Integrated Computer-Aided Engineering 工程技术-工程：综合

CiteScore

9.90

自引率

21.50%

发文量

审稿时长

>12 weeks

期刊介绍： Integrated Computer-Aided Engineering (ICAE) was founded in 1993. "Based on the premise that interdisciplinary thinking and synergistic collaboration of disciplines can solve complex problems, open new frontiers, and lead to true innovations and breakthroughs, the cornerstone of industrial competitiveness and advancement of the society" as noted in the inaugural issue of the journal. The focus of ICAE is the integration of leading edge and emerging computer and information technologies for innovative solution of engineering problems. The journal fosters interdisciplinary research and presents a unique forum for innovative computer-aided engineering. It also publishes novel industrial applications of CAE, thus helping to bring new computational paradigms from research labs and classrooms to reality. Areas covered by the journal include (but are not limited to) artificial intelligence, advanced signal processing, biologically inspired computing, cognitive modeling, concurrent engineering, database management, distributed computing, evolutionary computing, fuzzy logic, genetic algorithms, geometric modeling, intelligent and adaptive systems, internet-based technologies, knowledge discovery and engineering, machine learning, mechatronics, mobile computing, multimedia technologies, networking, neural network computing, object-oriented systems, optimization and search, parallel processing, robotics virtual reality, and visualization techniques.