Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems

Conference on Learning for Dynamics & Control | Pub Date: 2022-12-14 | DOI: 10.48550/arXiv.2212.07313

Tobias Enders, James Harrison, M. Pavone, Maximilian Schiffer

Citations: 5

Abstract

We consider the sequential decision-making problem of making proactive request assignment and rejection decisions for a profit-maximizing operator of an autonomous mobility-on-demand system. We formalize this problem as a Markov decision process and propose a novel combination of multi-agent Soft Actor-Critic and weighted bipartite matching to obtain an anticipative control policy. Thereby, we factorize the operator's otherwise intractable action space, but still obtain a globally coordinated decision. Experiments based on real-world taxi data show that our method outperforms state-of-the-art benchmarks with respect to performance, stability, and computational tractability.
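To make the coordination step concrete, the following is a minimal sketch of how per-agent outputs could be combined by weighted bipartite matching. It is not the authors' implementation: the `scores` table, the use of `None` to mark a per-agent rejection, and the function name `best_matching` are all hypothetical, and the matching is solved by exhaustive search, which is only practical for tiny instances. A real system would presumably use a polynomial-time assignment solver.

```python
def best_matching(scores, vehicle=0, taken=frozenset()):
    """Maximum-weight bipartite matching by exhaustive search.

    scores[v][r] is a hypothetical weight that agent (vehicle) v assigns to
    serving request r, or None if agent v rejects request r.
    Each vehicle serves at most one request and each request is served by at
    most one vehicle. Returns (total_weight, {vehicle: request}).
    """
    if vehicle == len(scores):
        return 0.0, {}

    # Option 1: this vehicle stays idle.
    best_total, best_assign = best_matching(scores, vehicle + 1, taken)

    # Option 2: this vehicle serves one of the still-available requests.
    for request, weight in enumerate(scores[vehicle]):
        if weight is None or request in taken:
            continue
        sub_total, sub_assign = best_matching(
            scores, vehicle + 1, taken | {request}
        )
        if weight + sub_total > best_total:
            best_total = weight + sub_total
            best_assign = {vehicle: request, **sub_assign}

    return best_total, best_assign


# Illustrative numbers only: two vehicles, three requests.
# Both vehicles prefer request 0, and both reject request 2.
scores = [
    [4.0, 1.0, None],
    [3.5, 2.0, None],
]
total, assignment = best_matching(scores)
# Requests that no vehicle ends up serving are rejected by the operator.
rejected = sorted(set(range(3)) - set(assignment.values()))
print(total, assignment, rejected)
```

In this toy instance each agent on its own would pick request 0, which is infeasible. The matching resolves the conflict globally by sending vehicle 0 to request 0 and vehicle 1 to request 1, while request 2 stays unserved. This illustrates the idea in the abstract of factorizing the action space per agent while still reaching a coordinated decision.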
