Deep Reinforcement Learning Based Bus Stop-Skipping Strategy
Mau-Luen Tham, Bee-Sim Tay, Kok-Chin Khor, S. Phon-Amnuaisuk
2023 International Technical Conference on Circuits/Systems, Computers, and Communications (ITC-CSCC), 25 June 2023. DOI: 10.1109/ITC-CSCC58803.2023.10212607
A stop-skipping strategy can benefit both bus operators and passengers if the control is intelligent enough to adapt to changes in passenger demand and traffic conditions. This is possible via deep reinforcement learning (DRL), where an agent learns the optimal policy by continuously interacting with the dynamic bus operating environment. In this paper, one express bus lane followed by one no-skip flow is treated as one episode for bus route optimization. The objective is to maximize passenger satisfaction while minimizing bus operator expenditure. To this end, a reward function is formulated as a function of passenger waiting time, passenger in-vehicle time, and total bus travel time. Simulation results show that an agent trained with a double deep Q-network (DDQN) can intelligently skip stations and outperform the no-skip method under different passenger distribution patterns.
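The two concrete ingredients the abstract names, a time-based reward and a DDQN learner, can be sketched briefly. The Python below is an illustrative sketch, not the paper's published code: the weights W_WAIT, W_RIDE, and W_TRAVEL, the function names, and the network interfaces are hypothetical assumptions, while the target computation follows the standard double DQN update rule (the online network selects the next action, the target network evaluates it) that any DDQN agent relies on.

    import torch

    # Hypothetical weights trading off the three time terms; the abstract does
    # not publish the paper's actual coefficients.
    W_WAIT, W_RIDE, W_TRAVEL = 1.0, 1.0, 1.0

    def reward(wait_time, in_vehicle_time, travel_time):
        # Passenger satisfaction (waiting + in-vehicle time) and operator cost
        # (total bus travel time) all shrink as the reward grows, so the reward
        # is the negated weighted sum of the three times.
        return -(W_WAIT * wait_time
                 + W_RIDE * in_vehicle_time
                 + W_TRAVEL * travel_time)

    def ddqn_target(online_net, target_net, rewards, next_states, dones,
                    gamma=0.99):
        # Double DQN: action selection by the online network, action evaluation
        # by the target network, which reduces Q-value overestimation.
        with torch.no_grad():
            next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
            next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
            return rewards + gamma * (1.0 - dones.float()) * next_q

Negating the weighted time sum makes the agent prefer skip decisions that reduce waiting, riding, and operating time together, matching the stated objective of balancing passenger satisfaction against operator expenditure.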