Traffic efficiency and fairness optimisation for autonomous intersection management based on reinforcement learning

IF 3.1 · Zone 2, Engineering Technology · Q2 TRANSPORTATION

Transportmetrica A-Transport Science Pub Date : 2025-01-02 DOI:10.1080/23249935.2023.2232047

Yuanyuan Wu, David Z. W. Wang, Feng Zhu

Volume 21, Issue 1, Pages 247-271

https://www.sciencedirect.com/org/science/article/pii/S2324993523001859

Citations: 0

Abstract


Traffic efficiency and fairness optimisation for autonomous intersection management based on reinforcement learning

Autonomous Intersection Management (AIM) for high-level Connected and Automated Vehicles (CAVs) has evolved from rule-based to optimisation-based policies. However, at congested major-minor intersections, optimising solely for efficiency can negatively impact vehicle fairness. This study addresses this issue by proposing a deep reinforcement learning approach that optimises both traffic efficiency and fairness for AIM. In the modelled multi-objective Markov decision process, traffic fairness is measured by the difference between the crossing order and the approaching order of CAVs, while traffic efficiency is measured by average travel time. With unknown preferences of the objectives, Bellman optimality equation is generalised to obtain the optimal policies over the space of all possible preferences during the iterative training process. The effectiveness of the proposed method is evaluated in a simulated real-world intersection and compared with three benchmark policies, including the fairest policy for AIM: first-come-first-served. The learned policies perform best in reducing overall average vehicle delay, and demonstrate outstanding performance in balancing traffic fairness and efficiency.
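The two objectives named in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the function names `order_deviation` and `scalarised_reward` are hypothetical, unfairness is taken to be the mean absolute gap between each vehicle's approach rank and crossing rank, and the two objectives are combined with a simple linear preference weight. This is not the paper's generalised Bellman equation, only a way to see how a preference between efficiency and fairness changes the score of a crossing order.

```python
from typing import Sequence


def order_deviation(approach_order: Sequence[str],
                    crossing_order: Sequence[str]) -> float:
    """Mean absolute difference between approach rank and crossing rank.

    Assumed fairness measure: 0.0 means vehicles cross in exactly the
    order they arrived (first-come-first-served).
    """
    if sorted(approach_order) != sorted(crossing_order):
        raise ValueError("both orders must contain the same vehicles")
    if not approach_order:
        return 0.0
    crossing_rank = {veh: i for i, veh in enumerate(crossing_order)}
    total = sum(abs(i - crossing_rank[veh])
                for i, veh in enumerate(approach_order))
    return total / len(approach_order)


def scalarised_reward(avg_travel_time: float,
                      unfairness: float,
                      w_efficiency: float) -> float:
    """Linear preference weighting of the two costs (an assumption).

    Both inputs are costs, so the reward is their negated weighted sum.
    """
    if not 0.0 <= w_efficiency <= 1.0:
        raise ValueError("w_efficiency must lie in [0, 1]")
    return -(w_efficiency * avg_travel_time
             + (1.0 - w_efficiency) * unfairness)


# Hypothetical vehicles listed in the order they approached.
approach = ["a", "b", "c", "d"]
fcfs_crossing = ["a", "b", "c", "d"]       # first-come-first-served
reordered_crossing = ["b", "a", "d", "c"]  # neighbouring pairs swapped

fcfs_unfairness = order_deviation(approach, fcfs_crossing)
reordered_unfairness = order_deviation(approach, reordered_crossing)
reward = scalarised_reward(30.0, reordered_unfairness, 0.5)
```

Under these assumptions, first-come-first-served scores zero unfairness by construction, which matches the abstract's description of it as the fairest benchmark policy; any reordering that shortens travel time has to pay for it through the fairness term, weighted by the chosen preference.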


Source journal

Transportmetrica A-Transport Science · TRANSPORTATION SCIENCE & TECHNOLOGY

CiteScore: 8.10

Self-citation rate: 12.10%

Journal description: Transportmetrica A provides a forum for original discourse in transport science. The international journal's focus is on the scientific approach to transport research methodology and empirical analysis of moving people and goods. Papers related to all aspects of transportation are welcome. A rigorous peer review that involves editor screening and anonymous refereeing for submitted articles facilitates quality output.