Urban Path Planning Based on Improved Model-based Reinforcement Learning Algorithm

Huimin Wang, Dong Liang, Yuliang Xi
DOI: 10.1145/3573834.3574534
Proceedings of the 4th International Conference on Advanced Information Science and System, 2022-11-25

Abstract

With the development of the urban economy and the continuous expansion of the vehicle fleet, traffic congestion has become one of the most serious problems affecting contemporary urban development. Using advanced road-network information perception and transmission technologies, path planning under real-time road conditions has become an important means of addressing this problem. Previously, our model-based reinforcement learning multipath planning algorithm achieved rapid path-planning responses, alleviating congestion drift to a certain extent. However, further research shows that the model performs poorly in extreme road-network environments (road-network traffic pressure of 0) and cannot explore the complete path; the main reason is that the effect of the model's hyperparameters on the convergence of the algorithm was ignored. To solve this problem, we explore the hyperparameters in detail, in particular the influence of the discount factor γ and the finalReward on model convergence, using Shenzhen road-network data. The results show that when the discount factor γ and the finalReward value satisfy certain conditions, which are derived in this study, the improved model-based method can guarantee the convergence stability of the algorithm in extreme road-network environments. This paper reveals the importance of the design of the hyperparameters γ and finalReward, as well as their interrelationship, to the convergence of reinforcement learning algorithms, and we hope to offer some insights to work exploring the hyperparameters of reinforcement learning algorithms.
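To illustrate the kind of interplay between the discount factor γ and the terminal reward that the abstract refers to, here is a minimal value-iteration sketch on a toy road graph. The graph, step cost, and parameter values are illustrative assumptions, not the authors' model; the name `final_reward` merely mirrors the paper's finalReward hyperparameter.

```python
# Hedged sketch: value iteration on a toy chain of intersections,
# showing how gamma and the goal (terminal) reward jointly determine
# whether distant states "see" the goal. This is NOT the paper's
# algorithm; it is a generic tabular illustration.

# Chain of intersections: 0 -> 1 -> 2 -> 3 (goal); each hop costs 1.
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1, 3], 3: []}
GOAL = 3
STEP_COST = -1.0

def value_iteration(gamma, final_reward, sweeps=100):
    """Return converged state values for the given gamma and goal reward."""
    V = {s: 0.0 for s in NEIGHBORS}
    for _ in range(sweeps):
        for s in NEIGHBORS:
            if s == GOAL:
                continue  # terminal state keeps value 0
            V[s] = max(
                (final_reward if n == GOAL else STEP_COST) + gamma * V[n]
                for n in NEIGHBORS[s]
            )
    return V

# With a sufficiently large discounted goal reward, values increase
# monotonically toward the goal, so greedy routing reaches it.
V_good = value_iteration(gamma=0.9, final_reward=10.0)

# With a small gamma and small goal reward, the discounted goal reward
# no longer outweighs accumulated step costs: distant states end up with
# negative values, mirroring the failure to explore the complete path.
V_bad = value_iteration(gamma=0.1, final_reward=1.0)
```

In the first call the value of the start state is positive and grows along the route, so a greedy policy follows the chain to the goal; in the second, the start state's value is negative, i.e. the goal's influence has effectively vanished at that distance. This is the qualitative effect of choosing γ and finalReward so that the discounted terminal reward dominates the path cost.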