动态车队管理的供需感知深度强化学习

ACM Transactions on Intelligent Systems and Technology (TIST) Pub Date : 2022-01-18 DOI:10.1145/3467979

Bolong Zheng, Lingfeng Ming, Q. Hu, Zhipeng Lü, Guanfeng Liu, Xiaofang Zhou

{"title":"动态车队管理的供需感知深度强化学习","authors":"Bolong Zheng, Lingfeng Ming, Q. Hu, Zhipeng Lü, Guanfeng Liu, Xiaofang Zhou","doi":"10.1145/3467979","DOIUrl":null,"url":null,"abstract":"Online ride-hailing platforms have reduced significantly the amounts of the time that taxis are idle and that passengers spend on waiting. As a key component of these platforms, the fleet management problem can be naturally modeled as a Markov Decision Process, which enables us to use the deep reinforcement learning. However, existing studies are proposed based on simplified problem settings that fail to model the complicated supply-dynamics and restrict the performance in the real traffic environment. In this article, we propose a supply-demand-aware deep reinforcement learning algorithm for taxi dispatching, where we use a deep Q-network with action sampling policy, called AS-DQN, to learn an optimal dispatching policy. Furthermore, we utilize a dueling network architecture, called AS-DDQN, to improve the performance of AS-DQN. Extensive experiments on real-world datasets offer insight into the performance of our model and show that it is capable of outperforming the baseline approaches.","PeriodicalId":123526,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology (TIST)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Supply-Demand-aware Deep Reinforcement Learning for Dynamic Fleet Management\",\"authors\":\"Bolong Zheng, Lingfeng Ming, Q. Hu, Zhipeng Lü, Guanfeng Liu, Xiaofang Zhou\",\"doi\":\"10.1145/3467979\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Online ride-hailing platforms have reduced significantly the amounts of the time that taxis are idle and that passengers spend on waiting. As a key component of these platforms, the fleet management problem can be naturally modeled as a Markov Decision Process, which enables us to use the deep reinforcement learning. However, existing studies are proposed based on simplified problem settings that fail to model the complicated supply-dynamics and restrict the performance in the real traffic environment. In this article, we propose a supply-demand-aware deep reinforcement learning algorithm for taxi dispatching, where we use a deep Q-network with action sampling policy, called AS-DQN, to learn an optimal dispatching policy. Furthermore, we utilize a dueling network architecture, called AS-DDQN, to improve the performance of AS-DQN. Extensive experiments on real-world datasets offer insight into the performance of our model and show that it is capable of outperforming the baseline approaches.\",\"PeriodicalId\":123526,\"journal\":{\"name\":\"ACM Transactions on Intelligent Systems and Technology (TIST)\",\"volume\":\"51 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-01-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Intelligent Systems and Technology (TIST)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3467979\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Intelligent Systems and Technology (TIST)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3467979","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 7

摘要

网约车平台大大减少了出租车闲置和乘客等待的时间。作为这些平台的关键组成部分，车队管理问题可以自然地建模为马尔可夫决策过程，这使我们能够使用深度强化学习。然而，现有的研究都是基于简化的问题设置，未能对复杂的供给动力学进行建模，限制了实际交通环境下的性能。在本文中，我们提出了一种用于出租车调度的供需感知深度强化学习算法，其中我们使用带有动作采样策略的深度q网络(称为AS-DQN)来学习最优调度策略。此外，我们利用一种称为AS-DDQN的决斗网络架构来提高AS-DQN的性能。在真实世界数据集上的大量实验提供了对我们模型性能的洞察，并表明它能够优于基线方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Supply-Demand-aware Deep Reinforcement Learning for Dynamic Fleet Management

Online ride-hailing platforms have reduced significantly the amounts of the time that taxis are idle and that passengers spend on waiting. As a key component of these platforms, the fleet management problem can be naturally modeled as a Markov Decision Process, which enables us to use the deep reinforcement learning. However, existing studies are proposed based on simplified problem settings that fail to model the complicated supply-dynamics and restrict the performance in the real traffic environment. In this article, we propose a supply-demand-aware deep reinforcement learning algorithm for taxi dispatching, where we use a deep Q-network with action sampling policy, called AS-DQN, to learn an optimal dispatching policy. Furthermore, we utilize a dueling network architecture, called AS-DDQN, to improve the performance of AS-DQN. Extensive experiments on real-world datasets offer insight into the performance of our model and show that it is capable of outperforming the baseline approaches.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM Transactions on Intelligent Systems and Technology (TIST)

自引率

0.00%

发文量