Fuzzy Q-Learning interaction controller design for collaborative robot

Cobot, Pub Date: 2022-11-01, DOI: 10.12688/cobot.17595.1
Kaichen Ying, Chen chin-yin, Longxiang Wang

Abstract

Background: Admittance control is widely used in physical human-robot interaction (pHRI). Its most critical aspect is the choice of admittance parameters, but a constant admittance value cannot satisfy the interaction performance criteria, smoothness in particular. Variable admittance control overcomes this limitation by adjusting the admittance value in real time. This paper proposes a fuzzy Q-learning (FQL) variable admittance control system that integrates a fuzzy inference system (FIS) with the reinforcement learning method Q-learning. Methods: The FIS maps a continuous input state onto fuzzy sets, and Q-learning trains the premise strength of the fuzzy rules to obtain the optimal policy for the variable admittance value. To verify the performance of the method, an experiment was performed on an AUBO i5 robot. The training trajectory is a point-to-point (PTP) trajectory, and several interaction variables are compared before and after training to show the validity of the algorithm. Results: The experimental results show that the reward converges to a smaller value in about 25 episodes, and the reward over the last five episodes is reduced by 68%. After training, the motion trajectory is closer to the ideal min-jerk trajectory, and both the deviation and the mean value of the interaction force become smaller. Conclusions: The proposed FQL method converges in a few episodes and improves pHRI performance by minimizing the jerk-based cost function.
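The abstract does not give the update equations, so the following is only a minimal sketch of a textbook-style fuzzy Q-learning loop, not the authors' implementation. It assumes a single scalar input state, evenly spaced triangular membership functions, a small set of candidate admittance (damping) values per rule, and a reward defined as a negative cost; the class name `FuzzyQLearner` and all parameter values are hypothetical.

```python
import numpy as np


class FuzzyQLearner:
    """Illustrative fuzzy Q-learning agent (assumed formulation, scalar input)."""

    def __init__(self, centers, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        # centers: evenly spaced peaks of the triangular fuzzy sets (one per rule)
        # actions: candidate admittance values shared by all rules (hypothetical)
        self.centers = np.asarray(centers, dtype=float)
        self.actions = np.asarray(actions, dtype=float)
        self.q = np.zeros((len(self.centers), len(self.actions)))
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = np.random.default_rng(seed)

    def firing_strengths(self, x):
        # Fuzzify the continuous state: normalized triangular memberships.
        width = self.centers[1] - self.centers[0]
        x = np.clip(x, self.centers[0], self.centers[-1])
        mu = np.maximum(0.0, 1.0 - np.abs(x - self.centers) / width)
        return mu / mu.sum()

    def act(self, x):
        # Each rule picks one candidate action (epsilon-greedy); the output
        # admittance value is the firing-strength-weighted blend of the picks.
        phi = self.firing_strengths(x)
        greedy = self.q.argmax(axis=1)
        random_pick = self.rng.integers(len(self.actions), size=len(self.centers))
        explore = self.rng.random(len(self.centers)) < self.epsilon
        chosen = np.where(explore, random_pick, greedy)
        action = float(phi @ self.actions[chosen])
        return action, phi, chosen

    def update(self, phi, chosen, reward, next_x):
        # Temporal-difference update, distributed over rules by firing strength.
        rules = np.arange(len(self.centers))
        q_sa = phi @ self.q[rules, chosen]
        v_next = self.firing_strengths(next_x) @ self.q.max(axis=1)
        td_error = reward + self.gamma * v_next - q_sa
        self.q[rules, chosen] += self.alpha * td_error * phi
        return td_error


# Usage: one interaction step with three rules and three candidate values.
agent = FuzzyQLearner(centers=[0.0, 0.5, 1.0], actions=[10.0, 20.0, 40.0])
value, phi, chosen = agent.act(0.25)
agent.update(phi, chosen, reward=-1.0, next_x=0.30)
```

In a variable admittance setting, `value` would be written to the admittance controller at each control step, and `reward` would be the negative of a jerk-based cost accumulated over that step; both mappings are assumptions here.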
About the journal: Cobot is a rapid multidisciplinary open access publishing platform for research focused on the interdisciplinary field of collaborative robots. The aim of Cobot is to enhance knowledge and share the results of the latest innovative technologies for the technicians, researchers and experts engaged in collaborative robot research. The platform welcomes submissions in all areas of scientific and technical research related to collaborative robots, and all articles benefit from open peer review. The scope of Cobot includes, but is not limited to:
● Intelligent robots
● Artificial intelligence
● Human-machine collaboration and integration
● Machine vision
● Intelligent sensing
● Smart materials
● Design, development and testing of collaborative robots
● Software for cobots
● Industrial applications of cobots
● Service applications of cobots
● Medical and health applications of cobots
● Educational applications of cobots
In addition to research articles and case studies, Cobot accepts a variety of article types, including method articles, study protocols, software tools, systematic reviews, data notes, brief reports, and opinion articles.