Cooperative dual-actor proximal policy optimization algorithm for multi-robot complex control task

Impact factor: 8.0 · Zone 1, Engineering and Technology · JCR Q1, Computer Science, Artificial Intelligence
Jacky Baltes, Ilham Akbar, Saeed Saeedvand
Advanced Engineering Informatics, Volume 63, Article 102960. Published 2025-01-01. DOI: 10.1016/j.aei.2024.102960
https://www.sciencedirect.com/science/article/pii/S1474034624006116
Citations: 0

Abstract

This paper introduces a novel multi-agent Deep Reinforcement Learning (DRL) framework, the Cooperative Dual-Actor Proximal Policy Optimization (CDA-PPO) algorithm, designed for complex cooperative learning control tasks with humanoid robots. Effective cooperation among multiple humanoid robots is a critical challenge, particularly when it involves complex walking gait control and external disturbances in dynamic environments. This is especially pertinent for tasks requiring precise coordination and control, such as joint object transportation. In many real-life scenarios, humanoid robots may need to cooperate to carry large objects. This capability is crucial for applications in logistics, manufacturing, intelligent transportation, and search and rescue. Humanoid robots have gained significant popularity, and their use in such cooperative tasks is becoming more common. To address this challenge, we propose CDA-PPO, which introduces a learning-based communication platform between agents and employs two distinct policy networks for each agent. This dual-policy approach enhances the robots' ability to adapt to complex interactions and maintain stability while performing intricate tasks. We demonstrate the efficacy of CDA-PPO in a cooperative object-transportation scenario in which two humanoid robots collaborate to carry a table. The experimental results show that CDA-PPO significantly outperforms traditional methods, such as Independent PPO (IPPO), Multi-Agent PPO (MAPPO), and Multi-Agent Twin Delayed Deep Deterministic Policy Gradient (MATD3), in terms of training efficiency, stability, reward acquisition, and cooperative balance control with effective coordination between the robots. The findings underscore the potential of CDA-PPO to advance cooperative multi-agent control, paving the way for future research in complex robotics applications.
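
For readers unfamiliar with the base method, the following is a minimal sketch of the standard PPO clipped surrogate objective that PPO-family methods such as IPPO and MAPPO build on. It is not the CDA-PPO dual-actor update or its communication mechanism, neither of which the abstract specifies. The function name `ppo_clipped_objective` and the default `clip_eps = 0.2` are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def ppo_clipped_objective(
    new_log_probs: Sequence[float],
    old_log_probs: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed illustrative value, not taken from the paper
) -> float:
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    Hypothetical helper for illustration only; it shows generic PPO,
    not the paper's dual-actor variant.
    """
    if not (len(new_log_probs) == len(old_log_probs) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if len(advantages) == 0:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps] to limit the policy step size.
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage. With a positive advantage, the gain from raising an action's probability is capped once the ratio exceeds `1 + clip_eps`; with a negative advantage, the unclipped penalty is kept.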
Source journal
Advanced Engineering Informatics (Engineering and Technology: Engineering, Multidisciplinary)
CiteScore: 12.40
Self-citation rate: 18.20%
Articles published: 292
Review time: 45 days
About the journal: Advanced Engineering Informatics is an international journal that solicits research papers with an emphasis on 'knowledge' and 'engineering applications'. The journal seeks original papers that report progress in applying methods of engineering informatics. These papers should have engineering relevance and help provide a scientific base for more reliable, spontaneous, and creative engineering decision-making. Additionally, papers should demonstrate the science of supporting knowledge-intensive engineering tasks and validate the generality, power, and scalability of new methods through rigorous evaluation, preferably both qualitatively and quantitatively. Abstracting and indexing for Advanced Engineering Informatics include Science Citation Index Expanded, Scopus, and INSPEC.