{"title":"协作机器人的模糊Q学习交互控制器设计","authors":"Kaichen Ying, Chen chin-yin, Longxiang Wang","doi":"10.12688/cobot.17595.1","DOIUrl":null,"url":null,"abstract":"Background: In physical human-robot interaction (pHRI), admittance control is widely used. The most critical thing in admittance control is the configuration of admittance parameters, but a constant admittance value can not meet the needs of interactive indicators smoothness especially. Variable admittance control is a method to overcome this limitation by adjusting the admittance value in real time. This paper proposes a fuzzy Q-learning (FQL) variable admittance control system, which integrates the fuzzy system (FIS) and reinforcement learning method Q-learning. Methods: FIS is used to turn a continuous input state into fuzzy set and Q-learning is used to train the premise strength of fuzzy rules to get the optimal policy of variable admittance value. To verify the performance of this method, an experiment was performed using an AUBO i5 robot. Training trajectory is point-to-point (PTP) trajectory, several interaction variables before and after training by the algorithm are compared to show the validity of algorithm. Results: Experimental results show that the reward converges to a smaller value in about 25 episodes, and the reward of the last five episodes reduces by 68%. The motion trajectory after algorithm training is closer to the ideal min-jerk trajectory and the deviation and mean value of interaction force become smaller. Conclusions: The proposed FQL method can converge in a few episodes and can improve the performance of pHRI by minimizing the jerk based cost function","PeriodicalId":29807,"journal":{"name":"Cobot","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Fuzzy Q-Learning interaction controller design for collaborative robot\",\"authors\":\"Kaichen Ying, Chen chin-yin, Longxiang Wang\",\"doi\":\"10.12688/cobot.17595.1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Background: In physical human-robot interaction (pHRI), admittance control is widely used. The most critical thing in admittance control is the configuration of admittance parameters, but a constant admittance value can not meet the needs of interactive indicators smoothness especially. Variable admittance control is a method to overcome this limitation by adjusting the admittance value in real time. This paper proposes a fuzzy Q-learning (FQL) variable admittance control system, which integrates the fuzzy system (FIS) and reinforcement learning method Q-learning. Methods: FIS is used to turn a continuous input state into fuzzy set and Q-learning is used to train the premise strength of fuzzy rules to get the optimal policy of variable admittance value. To verify the performance of this method, an experiment was performed using an AUBO i5 robot. Training trajectory is point-to-point (PTP) trajectory, several interaction variables before and after training by the algorithm are compared to show the validity of algorithm. Results: Experimental results show that the reward converges to a smaller value in about 25 episodes, and the reward of the last five episodes reduces by 68%. The motion trajectory after algorithm training is closer to the ideal min-jerk trajectory and the deviation and mean value of interaction force become smaller. 
Conclusions: The proposed FQL method can converge in a few episodes and can improve the performance of pHRI by minimizing the jerk based cost function\",\"PeriodicalId\":29807,\"journal\":{\"name\":\"Cobot\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Cobot\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.12688/cobot.17595.1\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cobot","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.12688/cobot.17595.1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Fuzzy Q-Learning interaction controller design for collaborative robot
Background: In physical human-robot interaction (pHRI), admittance control is widely used. The most critical element of admittance control is the configuration of the admittance parameters, but a constant admittance value cannot satisfy the interaction requirements, especially smoothness. Variable admittance control overcomes this limitation by adjusting the admittance value in real time. This paper proposes a fuzzy Q-learning (FQL) variable admittance control system, which integrates a fuzzy inference system (FIS) with the reinforcement learning method Q-learning. Methods: The FIS converts the continuous input state into fuzzy sets, and Q-learning trains the premise strengths of the fuzzy rules to obtain the optimal variable-admittance policy. To verify the performance of this method, an experiment was performed with an AUBO i5 robot. The training trajectory is a point-to-point (PTP) trajectory, and several interaction variables are compared before and after training to demonstrate the validity of the algorithm. Results: The experimental results show that the reward converges to a smaller value in about 25 episodes, and the reward of the last five episodes is reduced by 68%. After training, the motion trajectory is closer to the ideal minimum-jerk trajectory, and both the deviation and the mean value of the interaction force become smaller. Conclusions: The proposed FQL method converges within a few episodes and improves the performance of pHRI by minimizing a jerk-based cost function.
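To make the control scheme more concrete, the following Python sketch illustrates, under stated assumptions, one way a fuzzy Q-learning loop can modulate the damping of a first-order admittance model (M·v̇ + D·v = F_h) in real time. This is not the authors' implementation: the choice of measured human force as the state variable, the fuzzy-set centers, the candidate damping values, and the learning hyperparameters are all illustrative assumptions, and the jerk-based reward used in the paper is left to the caller.

```python
"""Illustrative sketch of a fuzzy Q-learning variable admittance loop.
All state variables, membership functions, candidate damping values and
hyperparameters below are assumptions made for illustration only."""
import numpy as np

# --- Fuzzy partition of a 1-D state (assumed: measured human force, N) ---
def triangular(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    return max(0.0, min((x - a) / (b - a + 1e-9), (c - x) / (c - b + 1e-9)))

FORCE_CENTERS = [-20.0, 0.0, 20.0]          # assumed fuzzy-set centers (N)

def fuzzify(force):
    """Return the normalised firing strength of each fuzzy rule."""
    w = np.array([triangular(force, c - 20, c, c + 20) for c in FORCE_CENTERS])
    s = w.sum()
    return w / s if s > 0 else np.ones_like(w) / len(w)

# --- Fuzzy Q-learning over candidate damping values ----------------------
DAMPING_ACTIONS = np.array([5.0, 15.0, 30.0])   # assumed candidate D values
q = np.zeros((len(FORCE_CENTERS), len(DAMPING_ACTIONS)))  # q per rule/action
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1               # assumed learning parameters

def select_damping(weights):
    """Each rule picks an action (epsilon-greedy); blend by firing strength."""
    idx = [np.random.randint(len(DAMPING_ACTIONS)) if np.random.rand() < EPS
           else int(np.argmax(q[i])) for i in range(len(weights))]
    damping = float(np.dot(weights, DAMPING_ACTIONS[idx]))
    return damping, idx

def update_q(weights, idx, reward, next_weights):
    """Standard FQL update: distribute the TD error over the fired rules."""
    q_next = float(np.dot(next_weights, q.max(axis=1)))   # value of next state
    q_taken = float(np.dot(weights, q[np.arange(len(idx)), idx]))
    td_error = reward + GAMMA * q_next - q_taken
    for i, a in enumerate(idx):
        q[i, a] += ALPHA * weights[i] * td_error

# --- Admittance model:  M * dv/dt + D * v = F_h  (assumed M, no stiffness) -
M, DT = 2.0, 0.01

def admittance_step(v, force, damping):
    """Integrate the admittance dynamics one control cycle forward."""
    acc = (force - damping * v) / M
    return v + acc * DT, acc

# Example of one control cycle; the reward (jerk-based cost against a
# minimum-jerk reference in the paper) must be supplied by the caller:
#   w = fuzzify(force); D, idx = select_damping(w)
#   v, acc = admittance_step(v, force, D)
#   update_q(w, idx, reward, fuzzify(next_force))
```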
About the journal:
Cobot is a rapid multidisciplinary open access publishing platform for research focused on the interdisciplinary field of collaborative robots. The aim of Cobot is to advance knowledge and share the results of the latest innovative technologies with technicians, researchers and experts engaged in collaborative robot research. The platform welcomes submissions in all areas of scientific and technical research related to collaborative robots, and all articles benefit from open peer review.
The scope of Cobot includes, but is not limited to:
● Intelligent robots
● Artificial intelligence
● Human-machine collaboration and integration
● Machine vision
● Intelligent sensing
● Smart materials
● Design, development and testing of collaborative robots
● Software for cobots
● Industrial applications of cobots
● Service applications of cobots
● Medical and health applications of cobots
● Educational applications of cobots
As well as research articles and case studies, Cobot accepts a variety of article types including method articles, study protocols, software tools, systematic reviews, data notes, brief reports, and opinion articles.