Autonomous control of soft robots using safe reinforcement learning and covariance matrix adaptation

Impact factor: 8.0 | Zone 2, Computer Science | JCR Q1, Automation & Control Systems

Engineering Applications of Artificial Intelligence. Pub Date: 2025-04-22. DOI: 10.1016/j.engappai.2025.110791

Shaswat Garg, Masoud Goharimanesh, Sina Sajjadi, Farrokh Janabi-Sharifi

Volume 153, Article 110791. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0952197625007912

Citations: 0



Autonomous control of soft robots using safe reinforcement learning and covariance matrix adaptation

The control of soft robots (such as continuum robots) poses significant challenges due to their coupled dynamics with significant inherent nonlinearities. Recently, model-free reinforcement learning algorithms have been proposed as an attractive alternative to model-based methods to address such a challenging control problem through unsupervised learning. However, the safety of robots is usually ignored while training such algorithms. This is particularly important for medical applications of soft robots. Also, the curse of dimensionality in soft robots makes it difficult for a reinforcement learning algorithm to develop an optimal controller. In this work, we propose a safe phasic soft actor–critic algorithm with a covariance matrix adaptation network which is then tested on different soft robots. We demonstrate that the proposed algorithm could learn an optimal policy quickly while satisfying the safety constraints. We formulated and tested our algorithm for (i) multigait soft robot; (ii) soft gripper robot; and (iii) soft robotic trunk. The proposed algorithm achieved an average of 150% higher rewards compared to other state-of-the-art algorithms. Also, adding the safety layer helped reduce the tracking error by 8 times when compared to the algorithm without a safety layer. The policy is validated in Simulation Open Framework Architecture (SOFA) simulations against other state-of-the-art algorithms in terms of tracking errors.
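The two headline figures in the abstract are relative comparisons and are easy to misread. The short sketch below works through the arithmetic, reading "150% higher" as 2.5 times the baseline reward and "reduced by 8 times" as one eighth of the baseline error. The baseline values and function names are hypothetical, chosen only for illustration; the abstract reports no absolute rewards or errors.

```python
def reward_after_gain(baseline_reward: float, percent_higher: float) -> float:
    """Reward that is `percent_higher` percent above a positive baseline."""
    return baseline_reward * (1.0 + percent_higher / 100.0)


def error_after_reduction(baseline_error: float, factor: float) -> float:
    """Tracking error after being reduced by a multiplicative factor."""
    return baseline_error / factor


# Hypothetical baselines, not taken from the paper.
baseline_reward = 100.0  # average reward of a comparison algorithm
baseline_error = 8.0     # tracking error without the safety layer

print(reward_after_gain(baseline_reward, 150.0))   # 250.0
print(error_after_reduction(baseline_error, 8.0))  # 1.0
```

Under this reading, a 150% gain is 2.5 times the baseline rather than 1.5 times, and an 8-fold reduction leaves 12.5% of the original error.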


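Source journal

Engineering Applications of Artificial Intelligence (category: Engineering Technology, Engineering: Electrical & Electronic)

CiteScore: 9.60
Self-citation rate: 10.00%
Articles published: 505
Review time: 68 days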

Journal description: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.