Actor-critic neural network reinforcement learning for walking control of a 5-link bipedal robot
Y. Vaghei, A. Ghanbari, S. Noorani
2014 Second RSI/ISM International Conference on Robotics and Mechatronics (ICRoM), 18 December 2014
DOI: 10.1109/ICROM.2014.6990997
Research on adaptive control has increasingly focused on bio-inspired learning techniques for real-life applications. Reinforcement Learning (RL) is one of these major techniques and has recently been widely used in robot control tasks. Artificial neural networks, in turn, are an accurate approximation tool for nonlinear robot dynamic control. In this paper, our main goal is to combine the advantages of artificial neural networks and RL to shorten the learning time and improve control accuracy. We therefore implement a promising RL approach, actor-critic RL, to control the actuation torques of a planar five-link bipedal robot and keep the passive torso in the vertical position. Our control agent consists of two three-layered neural network units, the critic and the actor, responsible for learning prediction and learning control, respectively. These units are synchronized by the temporal-difference error, which uses an eligibility trace vector to assign credit or blame for the error. Moreover, since neural networks approximate the nonlinear functions in both the actor and the critic, we add a learning database to reduce the probability of inaccurate approximation. Results show that the presented control method achieves stable walking control of the bipedal robot.
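The actor-critic scheme the abstract describes, in which a single temporal-difference error updates both units through eligibility traces, can be sketched in miniature. The code below is not the paper's bipedal-robot controller: the toy 1D environment, the linear critic, and the Gaussian-policy actor are illustrative assumptions standing in for the three-layer networks, the learning database, and the five-link robot dynamics.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 5          # coarse one-hot features over the 1D state space
GAMMA, LAM = 0.95, 0.8  # discount factor and eligibility-trace decay
ALPHA_V, ALPHA_PI = 0.1, 0.01

def features(state):
    """One-hot coding of a state in [0, 1] into N_FEATURES bins."""
    phi = np.zeros(N_FEATURES)
    phi[min(int(state * N_FEATURES), N_FEATURES - 1)] = 1.0
    return phi

def step(state, action):
    """Toy dynamics: the agent is rewarded for pushing the state toward 1.0."""
    nxt = np.clip(state + 0.1 * np.tanh(action), 0.0, 1.0)
    reward = -abs(1.0 - nxt)          # closer to the goal -> higher reward
    return nxt, reward, nxt >= 0.99   # (next state, reward, done)

w = np.zeros(N_FEATURES)      # critic weights:  V(s) ~ w . phi(s)
theta = np.zeros(N_FEATURES)  # actor weights: mean action = theta . phi(s)

for episode in range(300):
    s = 0.0
    e_v = np.zeros(N_FEATURES)   # critic eligibility trace
    e_pi = np.zeros(N_FEATURES)  # actor eligibility trace
    for t in range(50):
        phi = features(s)
        mu = theta @ phi
        a = mu + rng.normal(0.0, 0.5)          # exploratory Gaussian policy
        s2, r, done = step(s, a)
        # One TD error synchronizes BOTH units, as in the abstract.
        delta = r + (0.0 if done else GAMMA * (w @ features(s2))) - w @ phi
        # Decay the traces, then mark the features just visited.
        e_v = GAMMA * LAM * e_v + phi
        # Gradient of the log-policy mean; 1/sigma^2 is folded into ALPHA_PI.
        e_pi = GAMMA * LAM * e_pi + (a - mu) * phi
        w += ALPHA_V * delta * e_v        # critic: learning prediction
        theta += ALPHA_PI * delta * e_pi  # actor: learning control
        s = s2
        if done:
            break
# After training, the critic values states near the goal more highly, and
# the actor's mean action pushes the state toward the goal.
```

The traces let one scalar TD error assign credit or blame to all recently active features rather than only the last step, which is what makes the shared-error synchronization between the two units work.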