Lane merging in autonomous vehicle urban driving using reinforcement learning models

Impact Factor 6.1 | JCR Q1 (Geography)
El houssine Amraouy, Ali Yahyaouy, Hamid Gualous, Hicham Chaoui, Sanaa Faquir
{"title":"Lane merging in autonomous vehicle urban driving using reinforcement learning models","authors":"El houssine Amraouy ,&nbsp;Ali Yahyaouy ,&nbsp;Hamid Gualous ,&nbsp;Hicham Chaoui ,&nbsp;Sanaa Faquir","doi":"10.1016/j.urbmob.2025.100150","DOIUrl":null,"url":null,"abstract":"<div><div>Autonomous vehicle lane merging is a critical task in urban driving, requiring precise navigation through complex and dynamic traffic environments. Challenges such as roadworks, lane reductions, merging from gas stations, low-visibility conditions, and crowded highway on-ramps demand continuous improvements in autonomous driving systems. Effective navigation in these situations, particularly at multi-lane junctions, merging onto high-speed roads, avoiding obstacles, and managing emergency vehicle lanes, requires robust decision-making that can adapt to changing road conditions. This paper compares three popular reinforcement learning (RL) algorithms—Proximal Policy Optimization (PPO), Advantage Actor-Critic (A2C), and Deep Q-Learning (DQL)—to address these challenges. Our findings show that in environments with specific decision points, DQL excels in tasks like lane reduction and obstacle avoidance due to its value-based approach. The A2C model, as an actor-critic policy, is particularly effective in dynamic environments, enabling the optimization of urban traffic control and merging at roundabouts. PPO, known for its policy optimization capabilities, offers a robust solution by balancing safety, efficiency, and adaptability, particularly in complex situations such as high-speed merging and low-visibility conditions. The simulation results confirm that DQL, A2C, and PPO collectively enhance autonomous vehicle performance by improving navigation capabilities, increasing safety, and adapting more effectively to the complexities of urban traffic environments. This work contributes valuable insights into the application of RL for real-world autonomous driving, providing a detailed comparative evaluation that supports the selection of algorithms tailored to specific driving tasks.</div></div>","PeriodicalId":100852,"journal":{"name":"Journal of Urban Mobility","volume":"8 ","pages":"Article 100150"},"PeriodicalIF":6.1000,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Urban Mobility","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2667091725000524","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"GEOGRAPHY","Score":null,"Total":0}
Citations: 0

Abstract

Autonomous vehicle lane merging is a critical task in urban driving, requiring precise navigation through complex and dynamic traffic environments. Challenges such as roadworks, lane reductions, merging from gas stations, low-visibility conditions, and crowded highway on-ramps demand continuous improvements in autonomous driving systems. Effective navigation in these situations, particularly when negotiating multi-lane junctions, merging onto high-speed roads, avoiding obstacles, and managing emergency-vehicle lanes, requires robust decision-making that can adapt to changing road conditions. This paper compares three popular reinforcement learning (RL) algorithms, Proximal Policy Optimization (PPO), Advantage Actor-Critic (A2C), and Deep Q-Learning (DQL), to address these challenges. Our findings show that in environments with specific decision points, DQL excels in tasks like lane reduction and obstacle avoidance due to its value-based approach. The A2C model, as an actor-critic policy, is particularly effective in dynamic environments, enabling the optimization of urban traffic control and merging at roundabouts. PPO, known for its policy optimization capabilities, offers a robust solution by balancing safety, efficiency, and adaptability, particularly in complex situations such as high-speed merging and low-visibility conditions. The simulation results confirm that DQL, A2C, and PPO collectively enhance autonomous vehicle performance by improving navigation capabilities, increasing safety, and adapting more effectively to the complexities of urban traffic environments. This work contributes valuable insights into the application of RL for real-world autonomous driving, providing a detailed comparative evaluation that supports the selection of algorithms tailored to specific driving tasks.
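As a rough illustration of how such a three-algorithm comparison can be set up, the sketch below trains PPO, A2C, and a DQL-style agent (DQN) on a lane-merging task and reports mean episode return. The paper does not specify its simulator or training configuration; the highway-env "merge-v0" environment, the stable-baselines3 implementations, and the small training budget used here are assumptions made purely for illustration.

```python
# Minimal sketch (not the paper's setup): comparing PPO, A2C, and DQN on a
# lane-merging environment. Assumes gymnasium, highway-env, and
# stable-baselines3 are installed.
import gymnasium as gym
import highway_env  # noqa: F401  # registers "merge-v0" and related driving envs
from stable_baselines3 import PPO, A2C, DQN

ALGOS = {"PPO": PPO, "A2C": A2C, "DQN": DQN}  # DQN stands in for the paper's DQL

results = {}
for name, algo in ALGOS.items():
    env = gym.make("merge-v0")
    # Each agent uses a small MLP policy over the default kinematic observation.
    model = algo("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=20_000)  # short budget for illustration only

    # Evaluate the learned policy over a handful of episodes.
    returns = []
    for _ in range(10):
        obs, _ = env.reset()
        done, truncated, ep_return = False, False, 0.0
        while not (done or truncated):
            action, _ = model.predict(obs, deterministic=True)
            obs, reward, done, truncated, _ = env.step(action)
            ep_return += reward
        returns.append(ep_return)
    results[name] = sum(returns) / len(returns)

print(results)  # e.g. {"PPO": ..., "A2C": ..., "DQN": ...}
```

In a comparison of this kind, the merge environment's discrete meta-actions make it usable by both value-based (DQN) and policy-gradient (PPO, A2C) agents, which is what allows the three algorithms to be evaluated on identical tasks; task-specific metrics such as collision rate and merge completion time would be tracked alongside return in a full study.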