Chengmin Zhou , Xin Lu , Jiapeng Dai , Xiaoxu Liu , Bingding Huang , Pasi Fränti
{"title":"Hybrid of representation learning and reinforcement learning for dynamic and complex robotic motion planning","authors":"Chengmin Zhou , Xin Lu , Jiapeng Dai , Xiaoxu Liu , Bingding Huang , Pasi Fränti","doi":"10.1016/j.robot.2025.105167","DOIUrl":null,"url":null,"abstract":"<div><div>Motion planning is the soul of robot decision making. Classical planning algorithms like graph search and reaction-based algorithms face challenges in cases of dense and dynamic obstacles. Deep learning algorithms generate suboptimal one-step predictions that cause many collisions. Reinforcement learning algorithms generate optimal or near-optimal time-sequential predictions. However, they suffer from slow convergence, suboptimal converged results, and unstable training. This paper introduces a hybrid algorithm for robotic motion planning: <em><u>l</u>ong short-term memory</em> (LSTM) and <u>s</u>kip connection for <u>a</u>ttention-based <u>d</u>iscrete <u>s</u>oft <u>a</u>ctor <u>c</u>ritic (LSA-DSAC). First, graph network (relational graph) and attention network (attention weight) interpret the environmental state for the learning of the discrete soft actor critic algorithm. The expressive power of attention network outperforms that of graph in our task by difference analysis of these two representation methods. However, attention based DSAC faces the problem of unstable training (vanishing gradient). Second, the skip connection method is integrated to attention based DSAC to mitigate unstable training and improve convergence speed. Third, LSTM is taken to replace the sum operator of attention weigh and eliminate unstable training by slightly sacrificing convergence speed at early-stage training. Experiments show that LSA-DSAC outperforms the state-of-the-art in training and most evaluations. Physical robots are also implemented and tested in the real world.</div></div>","PeriodicalId":49592,"journal":{"name":"Robotics and Autonomous Systems","volume":"194 ","pages":"Article 105167"},"PeriodicalIF":5.2000,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Robotics and Autonomous Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0921889025002647","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
Motion planning is the soul of robot decision making. Classical planning algorithms like graph search and reaction-based algorithms face challenges in cases of dense and dynamic obstacles. Deep learning algorithms generate suboptimal one-step predictions that cause many collisions. Reinforcement learning algorithms generate optimal or near-optimal time-sequential predictions. However, they suffer from slow convergence, suboptimal converged results, and unstable training. This paper introduces a hybrid algorithm for robotic motion planning: long short-term memory (LSTM) and skip connection for attention-based discrete soft actor critic (LSA-DSAC). First, graph network (relational graph) and attention network (attention weight) interpret the environmental state for the learning of the discrete soft actor critic algorithm. The expressive power of attention network outperforms that of graph in our task by difference analysis of these two representation methods. However, attention based DSAC faces the problem of unstable training (vanishing gradient). Second, the skip connection method is integrated to attention based DSAC to mitigate unstable training and improve convergence speed. Third, LSTM is taken to replace the sum operator of attention weigh and eliminate unstable training by slightly sacrificing convergence speed at early-stage training. Experiments show that LSA-DSAC outperforms the state-of-the-art in training and most evaluations. Physical robots are also implemented and tested in the real world.
期刊介绍:
Robotics and Autonomous Systems will carry articles describing fundamental developments in the field of robotics, with special emphasis on autonomous systems. An important goal of this journal is to extend the state of the art in both symbolic and sensory based robot control and learning in the context of autonomous systems.
Robotics and Autonomous Systems will carry articles on the theoretical, computational and experimental aspects of autonomous systems, or modules of such systems.