Multi-agent reinforcement learning with causal communication for ride-sourcing pricing in mixed autonomy mobility

IF 7.6; Zone 1 (Engineering & Technology); JCR Q1 (Transportation Science & Technology)

Transportation Research Part C-Emerging Technologies; Pub Date: 2025-05-17; DOI: 10.1016/j.trc.2025.105164

Ningke Xie, Yong Chen, Wei Tang, Xiqun (Michael) Chen

Volume 176, Article 105164. Full text: https://www.sciencedirect.com/science/article/pii/S0968090X25001688

Citations: 0

Abstract


Multi-agent reinforcement learning with causal communication for ride-sourcing pricing in mixed autonomy mobility

The burgeoning self-driving technology has given the ride-sourcing market a solid impetus and has raised new demand and supply management challenges. In the context of a long-haul mixed operation of autonomous vehicles and human-driven vehicles, this paper focuses on profit-maximizing pricing for both the demand and supply sides, in which prices are differentiated by service type, time, and location. Diverging from most studies, which are limited to centralized control of small-scale problems, we align with the distributed and scalable requirements of practice and tackle the coordination challenge from a causal communication perspective. Based on the spatial supply–demand interdependencies inherent in the ride-sourcing market, operation areas are modeled as collaborative intelligent agents. The pricing problem is formulated as a decentralized partially observable Markov game augmented with neighborhood communication. A multi-agent reinforcement learning method with causal communication is then developed to jointly optimize the pricing policy and the communication mechanism through end-to-end learning. The bidirectional communication mechanism is kept effective and succinct by maximizing the causal effect of the communication message. Theoretical analysis proves that the proposed method copes with partial observability and non-stationary environments through collaborative communication. In addition, an agent-based simulator for mixed autonomy mobility is built on a real-world large-scale network, emulating the causal communication process among decentralized areas as well as the heterogeneity, elasticity, and uncertainty of ride-sourcing demand and supply. Two representative scenarios are designed to demonstrate the dynamic evolution of mixed autonomy mobility: (a) smaller-sized autonomous vehicles and conservative passenger acceptance (conservative stage), and (b) larger-sized autonomous vehicles and liberal passenger acceptance (liberal stage). The results highlight that incorporating the causal communication mechanism can speed up the learning process and guide informed pricing decisions. Furthermore, the proposed method yields managerial insights into proactively regulating pricing schemes for a smooth transition to fully autonomous ride-sourcing services.
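The abstract does not say how the causal effect of a communication message is measured. One common way to make the idea concrete is to compare the receiving agent's action distribution with the message against a counterfactual in which the message is replaced by a null message. The sketch below illustrates that idea only and is not the paper's method. The linear-softmax pricing policy, the all-zero null message, the KL divergence as the effect measure, and every name and number in it are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy(obs, msg, w_obs, w_msg):
    """Hypothetical pricing policy: probabilities over discrete price levels.

    Each price level has one weight row for the local observation and one
    for the message received from a neighbouring area.
    """
    logits = [
        sum(w * o for w, o in zip(row_obs, obs))
        + sum(w * m for w, m in zip(row_msg, msg))
        for row_obs, row_msg in zip(w_obs, w_msg)
    ]
    return softmax(logits)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 everywhere."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def message_effect(obs, msg, w_obs, w_msg):
    """Assumed effect measure: how far the message shifts the receiver's
    policy away from the counterfactual policy under an all-zero message."""
    null_msg = [0.0] * len(msg)
    with_msg = policy(obs, msg, w_obs, w_msg)
    without_msg = policy(obs, null_msg, w_obs, w_msg)
    return kl_divergence(with_msg, without_msg)


# Illustrative numbers only: two local features, one message channel,
# three price levels (low, medium, high).
obs = [0.8, 0.3]
msg = [1.0]
w_obs = [[1.0, -1.0], [0.0, 0.0], [-1.0, 1.0]]
w_msg = [[0.5], [0.0], [-0.5]]

effect = message_effect(obs, msg, w_obs, w_msg)
print(f"assumed causal effect of the message: {effect:.4f}")
```

In an end-to-end setup, a term like `message_effect` could be added to the training objective so that senders learn messages that actually change their neighbours' pricing decisions. A message that the receiver ignores scores zero under this measure.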


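Source journal

Transportation Research Part C-Emerging Technologies (Engineering & Technology, Transportation Science & Technology)

CiteScore: 15.80

Self-citation rate: 12.00%

Articles published: 332

Review time: 64 days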

Journal description: Transportation Research: Part C (TR_C) is dedicated to showcasing high-quality, scholarly research that delves into the development, applications, and implications of transportation systems and emerging technologies. Our focus lies not solely on individual technologies, but rather on their broader implications for the planning, design, operation, control, maintenance, and rehabilitation of transportation systems, services, and components. In essence, the intellectual core of the journal revolves around the transportation aspect rather than the technology itself. We actively encourage the integration of quantitative methods from diverse fields such as operations research, control systems, complex networks, computer science, and artificial intelligence. Join us in exploring the intersection of transportation systems and emerging technologies to drive innovation and progress in the field.