Modified Q-Learning Algorithm for Mobile Robot Real-Time Path Planning using Reduced States

Hidayat, A. Buono, K. Priandana, S. Wahjuni
{"title":"Modified Q-Learning Algorithm for Mobile Robot Real-Time Path Planning using Reduced States","authors":"Hidayat, A. Buono, K. Priandana, S. Wahjuni","doi":"10.29207/resti.v7i3.4949","DOIUrl":null,"url":null,"abstract":"Path planning is an essential algorithm in any autonomous mobile robot, including agricultural robots. One of the reinforcement learning methods that can be used for mobile robot path planning is the Q-Learning algorithm. However, the conventional Q-learning method explores all possible robot states in order to find the most optimum path. Thus, this method requires extensive computational cost especially when there are considerable grids to be computed. This study modified the original Q-Learning algorithm by removing the impassable area, so that these areas are not considered as grids to be computed. This modified Q-Learning method was simulated as path finding algorithm for autonomous mobile robot operated at the Agribusiness and Technology Park (ATP), IPB University. Two simulations were conducted to compare the original Q-Learning method and the modified Q-Learning method. The simulation results showed that the state reductions in the modified Q-Learning method can lower the computation cost to 50.71% from the computation cost of the original Q-Learning method, that is, an average computation time of 25.74s as compared to 50.75s, respectively. Both methods produce similar number of states as the robot’s optimal path, i.e. 56 states, based on the reward obtained by the robot while selecting the path. However, the modified Q-Learning algorithm is capable of finding the path to the destination point with a minimum learning rate parameter value of 0.2 when the discount factor value is 0.9.","PeriodicalId":435683,"journal":{"name":"Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.29207/resti.v7i3.4949","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Path planning is an essential algorithm for any autonomous mobile robot, including agricultural robots. One reinforcement learning method that can be used for mobile robot path planning is the Q-Learning algorithm. However, conventional Q-Learning explores all possible robot states in order to find the optimal path, so its computational cost becomes substantial when many grid cells must be computed. This study modifies the original Q-Learning algorithm by removing impassable areas from the state space, so that these areas are not included among the grid cells to be computed. The modified Q-Learning method was simulated as the path-finding algorithm for an autonomous mobile robot operated at the Agribusiness and Technology Park (ATP), IPB University. Two simulations were conducted to compare the original and the modified Q-Learning methods. The results show that the state reduction in the modified method lowers the computation cost to 50.71% of that of the original method: an average computation time of 25.74 s versus 50.75 s, respectively. Both methods produce optimal paths of the same length, 56 states, based on the reward obtained by the robot while selecting the path. However, the modified Q-Learning algorithm requires a minimum learning-rate value of 0.2 to find the path to the destination point when the discount factor is 0.9.
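The state-reduction idea described in the abstract can be illustrated with a short sketch: a standard tabular Q-Learning loop in which impassable cells are simply never entered into the Q-table, so updates iterate only over passable states. This is a minimal illustration, not the authors' implementation; the grid layout, reward values, and epsilon-greedy exploration rate are assumptions, while the learning rate (0.2) and discount factor (0.9) follow the values reported in the abstract.

```python
# Minimal sketch of Q-Learning with a reduced state set: impassable
# cells ('#') are excluded from the Q-table, so they are never computed.
# Grid, rewards, and exploration rate are illustrative assumptions.
import random

GRID = [
    "S....",
    ".##..",
    ".##..",
    "....G",
]  # '#' = impassable area, 'S' = start, 'G' = goal

ROWS, COLS = len(GRID), len(GRID[0])
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

# State reduction: build the Q-table only over passable cells.
passable = {(r, c) for r in range(ROWS) for c in range(COLS) if GRID[r][c] != "#"}
Q = {s: [0.0] * len(ACTIONS) for s in passable}

start = next(s for s in passable if GRID[s[0]][s[1]] == "S")
goal = next(s for s in passable if GRID[s[0]][s[1]] == "G")

alpha, gamma = 0.2, 0.9   # learning rate and discount factor from the abstract
epsilon = 0.1             # exploration rate (assumed)

def step(s, a):
    """Move if the target cell is passable; otherwise stay put with a penalty."""
    nr, nc = s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1]
    if (nr, nc) not in passable:       # off-grid or impassable area
        return s, -1.0
    if (nr, nc) == goal:
        return (nr, nc), 10.0
    return (nr, nc), -0.1              # small step cost favors short paths

for episode in range(500):
    s = start
    while s != goal:
        if random.random() < epsilon:  # epsilon-greedy action selection
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])
        s2, r = step(s, a)
        # Q-Learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
```

After training, greedily following argmax over Q from the start cell traces the learned path; the point of the modification is that Q holds no entries for the '#' cells, which is the state reduction the paper describes.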