Intelligent Traffic Control using Double Deep Q Networks for time-varying Traffic Flows

2021 8th International Conference on Signal Processing and Integrated Networks (SPIN). Publication date: 2021-08-26. DOI: 10.1109/SPIN52536.2021.9565961

Priyadharshini Shanmugasundaram, Aakash Sinha

{"title":"Intelligent Traffic Control using Double Deep Q Networks for time-varying Traffic Flows","authors":"Priyadharshini Shanmugasundaram, Aakash Sinha","doi":"10.1109/SPIN52536.2021.9565961","DOIUrl":null,"url":null,"abstract":"Reinforcement learning, a sub-field of Machine Learning has been garnering lot of research attention lately. It helps create intelligent agents that can incrementally learn optimal strategies for challenging environments by interacting with it. Such agents are best suited for solving problems like traffic congestion, which demand solutions that eater to dynamic changes in the traffic throughput. Intelligent transportation systems which use deep reinforcement learning can adapt to varying traffic demands and learn to maintain reduced congestion. In this paper, we propose a solution approach to use Double Deep Q Networks for traffic signal control of varied traffic flows in an isolated intersection. To improve the stability of our proposed method we have used target networks, delayed updates and experience replay mechanisms. We evaluate the performance of our method on different time-varying traffic flows and find that our method learns a robust and optimal strategy which reduces vehicle waiting time and queue length significantly. Our method achieved superior performance compared to traditional traffic signal control strategies. 
The method has been trained and evaluated through simulations of road networks created on Simulation of Urban Mobility (SUMO).","PeriodicalId":343177,"journal":{"name":"2021 8th International Conference on Signal Processing and Integrated Networks (SPIN)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 8th International Conference on Signal Processing and Integrated Networks (SPIN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPIN52536.2021.9565961","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}


Abstract

Reinforcement learning, a sub-field of machine learning, has been garnering a lot of research attention lately. It helps create intelligent agents that can incrementally learn optimal strategies for challenging environments by interacting with them. Such agents are best suited for solving problems like traffic congestion, which demand solutions that cater to dynamic changes in traffic throughput. Intelligent transportation systems that use deep reinforcement learning can adapt to varying traffic demands and learn to keep congestion low. In this paper, we propose an approach that uses Double Deep Q Networks for traffic signal control of varied traffic flows at an isolated intersection. To improve the stability of our proposed method, we use target networks, delayed updates and experience replay mechanisms. We evaluate the performance of our method on different time-varying traffic flows and find that it learns a robust and optimal strategy that significantly reduces vehicle waiting time and queue length. Our method achieved superior performance compared to traditional traffic signal control strategies. The method has been trained and evaluated through simulations of road networks created in Simulation of Urban Mobility (SUMO).
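The abstract names three stabilizing mechanisms without giving details. As a rough illustration only, the sketch below shows the standard Double DQN target (the online network selects the next action, the target network evaluates it) together with a minimal experience replay buffer. The function and class names, the list-based Q-values, and the example numbers are assumptions made for this sketch; the paper's actual state encoding, reward definition, network architecture and hyperparameters are not stated in the abstract.

```python
import random
from collections import deque


def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Standard Double DQN target for a single transition.

    next_q_online / next_q_target are the Q-values of the next state under
    the online and target networks (plain lists here, as an assumption).
    """
    if done:
        return reward
    # The online network chooses the action ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... and the target network supplies the value of that action.
    return reward + gamma * next_q_target[best_action]


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # Oldest transitions are discarded once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling breaks the correlation between consecutive steps.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Hypothetical numbers: two signal phases, reward as a negative waiting cost.
target = double_dqn_target(-4.0, [1.0, 3.0], [2.0, 1.0], done=False, gamma=0.5)
```

With these made-up values the online network prefers the second action, so the target uses the target network's value for that action (1.0) rather than the target network's own maximum (2.0). This decoupling is what distinguishes Double DQN from plain DQN. In a typical setup the target network's weights would be copied from the online network only every fixed number of steps, which is one common reading of "delayed updates".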
