Robust Multiagent Reinforcement Learning toward Coordinated Decision-Making of Automated Vehicles
Xiangkun He, Hao Chen, Chengqi Lv
SAE International Journal of Vehicle Dynamics, Stability and NVH (JCR Q2, Transportation Science & Technology; Impact Factor 2.8)
DOI: 10.4271/10-07-04-0031
Published: 2023-09-05 (Journal Article)
Cited by: 4
Abstract
Automated driving is essential for developing and deploying intelligent
transportation systems. However, unavoidable sensor noise or perception errors
may cause an automated vehicle to adopt suboptimal driving policies or even lead
to catastrophic failures. Additionally, the longitudinal and lateral
decision-making behaviors of automated driving (e.g., driving-speed and
lane-changing decisions) are coupled: when one is perturbed by unknown external
disturbances, the other changes or even degrades in performance. Together,
these challenges significantly curtail the potential of automated driving.
Here, to coordinate the longitudinal and lateral driving
decisions of an automated vehicle while ensuring policy robustness against
observational uncertainties, we propose a novel robust coordinated
decision-making technique via robust multiagent reinforcement learning.
Specifically, the automated driving longitudinal and lateral decisions under
observational perturbations are modeled as a constrained robust multiagent
Markov decision process. Meanwhile, a nonlinear constraint based on
Kullback–Leibler divergence is developed to bound the variation of the driving
policy under stochastic observational perturbations. Additionally, a
robust multiagent policy optimization approach is proposed to approximate the
optimal robust coordinated driving policy. Finally, we evaluate the proposed
robust coordinated decision-making method in three highway scenarios with
different traffic densities. Quantitatively, in the absence of noise, the
proposed method achieves an average improvement of approximately 25.58% in
traffic efficiency and 91.31% in safety over all baselines across the three
scenarios. In the presence of noise, our technique improves traffic efficiency
and safety by an average of approximately 30.81% and 81.02%, respectively, over
all baselines in the three scenarios. The results demonstrate that the
proposed approach is capable of improving automated driving performance and
ensuring policy robustness against observational uncertainties.
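As an illustrative sketch (not the paper's implementation, which is not given in the abstract), the Kullback–Leibler constraint described above amounts to checking that a perturbed policy's action distribution stays within a divergence bound of the nominal one. The action probabilities and the bound value below are hypothetical placeholders:

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) between two discrete action distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

# Hypothetical nominal policy over three lateral actions
# (keep lane, change left, change right) and a copy perturbed
# by observational noise.
pi_nominal = np.array([0.70, 0.20, 0.10])
pi_perturbed = np.array([0.60, 0.25, 0.15])

# Hypothetical trust-region-style bound on policy variation.
KL_BOUND = 0.05

kl = kl_divergence(pi_nominal, pi_perturbed)
constraint_satisfied = kl <= KL_BOUND
```

In a constrained policy-optimization scheme of this kind, such a bound would typically enter the training objective (e.g., via a Lagrangian or a projection step) rather than being checked post hoc as shown here.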