Deep Reinforcement Learning Based Bus Stop-Skipping Strategy
Mau-Luen Tham, Bee-Sim Tay, Kok-Chin Khor, S. Phon-Amnuaisuk
2023 International Technical Conference on Circuits/Systems, Computers, and Communications (ITC-CSCC), 25 June 2023. DOI: 10.1109/ITC-CSCC58803.2023.10212607
A stop-skipping strategy can benefit both bus operators and passengers if the control is intelligent enough to adapt to changes in passenger demand and traffic conditions. This is possible via deep reinforcement learning (DRL), where an agent learns the optimal policy by continuously interacting with the dynamic bus operating environment. In this paper, one express bus lane followed by one no-skip flow is treated as one episode for bus route optimization. The objective is to maximize passenger satisfaction while minimizing bus operator expenditure. To this end, a reward function is formulated as a function of passenger waiting time, passenger in-vehicle time, and total bus travel time. Simulation results show that an agent trained with a double deep Q-network (DDQN) can intelligently skip stations and outperform the no-skip method under different passenger distribution patterns.
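The two concrete ingredients the abstract names, a time-based reward and a DDQN learner, can be sketched briefly. The Python below is an illustrative sketch, not the paper's published code: the weights W_WAIT, W_RIDE, and W_TRAVEL, the function names, and the network interfaces are hypothetical assumptions, while the target computation follows the standard double DQN update rule (the online network selects the next action, the target network evaluates it) that any DDQN agent relies on.

    import torch

    # Hypothetical weights trading off the three time terms; the abstract does
    # not publish the paper's actual coefficients.
    W_WAIT, W_RIDE, W_TRAVEL = 1.0, 1.0, 1.0

    def reward(wait_time, in_vehicle_time, travel_time):
        # Passenger satisfaction (waiting + in-vehicle time) and operator cost
        # (total bus travel time) all shrink as the reward grows, so the reward
        # is the negated weighted sum of the three times.
        return -(W_WAIT * wait_time
                 + W_RIDE * in_vehicle_time
                 + W_TRAVEL * travel_time)

    def ddqn_target(online_net, target_net, rewards, next_states, dones,
                    gamma=0.99):
        # Double DQN: action selection by the online network, action evaluation
        # by the target network, which reduces Q-value overestimation.
        with torch.no_grad():
            next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
            next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
            return rewards + gamma * (1.0 - dones.float()) * next_q

Negating the weighted time sum makes the agent prefer skip decisions that reduce waiting, riding, and operating time together, matching the stated objective of balancing passenger satisfaction against operator expenditure.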