Robust Multi-Agent Reinforcement Learning Against Adversarial Attacks for Cooperative Self-Driving Vehicles

IF 1.5; Zone 4 (Management Science); JCR Q3 (ENGINEERING, ELECTRICAL & ELECTRONIC)

IET Radar, Sonar & Navigation. Pub Date: 2025-05-19. DOI: 10.1049/rsn2.70033

Chuyao Wang, Ziwei Wang, Nabil Aouf

Full text: https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/rsn2.70033

Citations: 0

Abstract

Multi-agent deep reinforcement learning (MARL) for self-driving vehicles aims to address the complex challenge of coordinating multiple autonomous agents in shared road environments. Compared to single-agent DRL systems, MARL creates a more stable system and improves vehicle performance in typical traffic scenarios. However, despite its sophisticated cooperative training, MARL remains vulnerable to unforeseen adversarial attacks. Perturbed observation states can lead one or more vehicles to make critical decision-making errors, triggering chain reactions that often result in severe collisions and accidents. To ensure the safety and reliability of multi-agent autonomous driving systems, this paper proposes a robust constrained cooperative multi-agent reinforcement learning (R-CCMARL) algorithm for self-driving vehicles, enabling a robust driving policy that can handle strong and unpredictable adversarial attacks. Unlike most existing work, our R-CCMARL framework employs a universal policy for each agent, achieving a more practical, non-task-oriented driving agent for real-world applications. This enables us to integrate shared observations with Mean-Field theory to model interactions within the MARL system. A risk formulation and a risk estimation network are developed to minimise the defined long-term risks. To further enhance robustness, this risk estimator is then used to construct a constrained optimisation objective function with a regulariser to maximise long-term rewards in worst-case scenarios. Experiments conducted in the CARLA simulator in intersection scenarios demonstrate that our method remains robust against adversarial state perturbations while maintaining high performance, both with and without attacks.
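The abstract names three ingredients without giving formulas: perturbed observation states, a Mean-Field model of agent interactions, and a risk-constrained objective. The sketch below illustrates generic textbook versions of each idea in plain Python. It is not the R-CCMARL algorithm. All function names, the uniform bounded noise model, the simple averaging of neighbour actions, and the hinge-style risk penalty are assumptions made here for illustration only.

```python
import random

# Illustrative only: none of these functions come from the paper.

def perturb_observation(obs, epsilon, rng):
    """Shift each observation component by at most epsilon.

    Assumption: a random L-infinity bounded perturbation stands in for an
    adversarial state attack; a real attacker would choose the shift.
    """
    return [x + rng.uniform(-epsilon, epsilon) for x in obs]

def mean_field_action(neighbour_actions):
    """Average the neighbours' action vectors into one mean action.

    Assumption: a plain arithmetic mean is used as the mean-field summary.
    """
    n = len(neighbour_actions)
    dim = len(neighbour_actions[0])
    return [sum(a[i] for a in neighbour_actions) / n for i in range(dim)]

def penalised_objective(expected_return, estimated_risk, risk_budget, weight):
    """Return minus a penalty on the risk that exceeds the budget.

    Assumption: a hinge penalty is one simple relaxation of a constrained
    objective; the paper's actual objective and regulariser are not shown.
    """
    excess = max(0.0, estimated_risk - risk_budget)
    return expected_return - weight * excess

if __name__ == "__main__":
    rng = random.Random(0)
    clean = [12.0, -3.5, 0.8]  # hypothetical observation vector
    attacked = perturb_observation(clean, epsilon=0.1, rng=rng)
    mean_act = mean_field_action([[1.0, 0.0], [0.0, 1.0]])
    score = penalised_objective(10.0, estimated_risk=0.1,
                                risk_budget=0.2, weight=2.0)
    print(attacked, mean_act, score)
```

In this toy form, a policy would be evaluated on `attacked` rather than `clean`, conditioned on `mean_act` instead of every neighbour's action, and scored with `penalised_objective` so that risk above the budget lowers the training signal.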

Source journal

IET Radar, Sonar & Navigation (Engineering & Technology: Telecommunications)

CiteScore: 4.10
Self-citation rate: 11.80%
Articles published: 137
Review time: 3.4 months

Journal description: IET Radar, Sonar & Navigation covers the theory and practice of systems and signals for radar, sonar, radiolocation, navigation, and surveillance purposes, in aerospace and terrestrial applications. Examples include advances in waveform design, clutter and detection, electronic warfare, adaptive array and superresolution methods, tracking algorithms, synthetic aperture, and target recognition techniques.