Dual manipulator collaborative shaft slot assembly via MADDPG
Junying Yao, Xiaojuan Wang, Renqiang Li, Wenxiao Wang, X. Ping, Yongkui Liu
2022 IEEE International Conference on Robotics and Biomimetics (ROBIO), 2022-12-05
https://doi.org/10.1109/ROBIO55434.2022.10011768
Traditional dual manipulator control systems suffer from complex motion coupling and a heavy computational burden, which makes it difficult for them to meet the requirements of intelligent assembly. Based on multi-agent reinforcement learning theory, this paper investigates the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm for collaborative shaft slot assembly with a dual manipulator system. Because this task is a long-sequence decision-making problem, rewards in traditional multi-agent reinforcement learning are often sparse. To address this, the design of the overall multi-agent reward takes into account the influence of each single manipulator's decisions on the overall task reward. In the proposed algorithm, the difference between each manipulator's state before and after a decision step is calculated and applied to the overall task reward as an internal state excitation, which improves the traditional multi-agent reward function. To verify the algorithm, a dual manipulator shaft slot assembly system and test scenario are built on the CoppeliaSim simulation platform. Simulation results show that the improved MADDPG algorithm achieves a shaft slot assembly success rate of about 83%.
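The reward modification described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything below is an assumption made for illustration: the function name `shaped_reward`, the `weight` parameter, and the choice to measure each manipulator's state difference as its progress toward a target position are all hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence

Point = Sequence[float]


def shaped_reward(
    task_reward: float,
    prev_states: Sequence[Point],
    next_states: Sequence[Point],
    targets: Sequence[Point],
    weight: float = 1.0,
) -> float:
    """Add a per-manipulator state-difference bonus to a shared task reward.

    Assumption (not from the paper): each manipulator's state is an
    end-effector position, and the state difference is the reduction in
    Euclidean distance to that manipulator's target over one step.
    """
    bonus = 0.0
    for prev, nxt, target in zip(prev_states, next_states, targets):
        # Positive when the manipulator moved closer to its target.
        progress = math.dist(prev, target) - math.dist(nxt, target)
        bonus += weight * progress
    return task_reward + bonus


# Two manipulators, sparse task reward of 0 on this step.
# Arm 1 moves 1 unit closer to its target; arm 2 does not move.
reward = shaped_reward(
    task_reward=0.0,
    prev_states=[(0.0, 0.0, 0.0), (0.0, 4.0, 0.0)],
    next_states=[(1.0, 0.0, 0.0), (0.0, 4.0, 0.0)],
    targets=[(3.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    weight=0.5,
)
```

In this sketch the shared reward is non-zero even when the sparse task reward is zero, which is the general idea behind using a dense internal signal in a long-sequence task. How the paper actually defines and weights the state difference may differ.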