Attacking cooperative multi-agent reinforcement learning by adversarial minority influence

IF 6.3 | Zone 1, Computer Science | JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Networks Pub Date : 2025-06-21 DOI:10.1016/j.neunet.2025.107747

Simin Li, Jun Guo, Jingqiao Xiu, Yuwei Zheng, Pu Feng, Xin Yu, Jiakai Wang, Aishan Liu, Yaodong Yang, Bo An, Wenjun Wu, Xianglong Liu

Neural Networks, volume 191, Article 107747. Full text: https://www.sciencedirect.com/science/article/pii/S0893608025006276

Citations: 0

Abstract

This study probes the vulnerabilities of cooperative multi-agent reinforcement learning (c-MARL) under adversarial attacks, a critical determinant of c-MARL’s worst-case performance prior to real-world implementation. Current observation-based attacks, constrained by white-box assumptions, overlook c-MARL’s complex multi-agent interactions and cooperative objectives, resulting in impractical and limited attack capabilities. To address these shortcomings, we propose Adversarial Minority Influence (AMI), a practical and strong attack for c-MARL. AMI is a practical black-box attack and can be launched without knowing victim parameters. AMI is also strong because it considers the complex multi-agent interaction and the cooperative goal of agents, enabling a single adversarial agent to unilaterally mislead the majority of victims into forming targeted worst-case cooperation. This mirrors minority influence phenomena in social psychology. To achieve maximum deviation in victim policies under complex agent-wise interactions, our unilateral attack aims to characterize and maximize the impact of the adversary on the victims. This is achieved by adapting a unilateral agent-wise relation metric derived from mutual information, thereby mitigating the adverse effects of victim influence on the adversary. To lead the victims into a jointly detrimental scenario, our targeted attack deceives victims into a long-term, cooperatively harmful situation by guiding each victim towards a specific target, determined through a trial-and-error process executed by a reinforcement learning agent. Through AMI, we achieve the first successful attack against real-world robot swarms and effectively fool agents in simulated environments, including StarCraft II and Multi-Agent MuJoCo, into collectively worst-case scenarios. The source code and demonstrations can be found at: https://github.com/DIG-Beihang/AMI.
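The abstract says the unilateral metric is derived from mutual information but does not give its form. As a rough illustration only, the sketch below computes a plain plug-in estimate of the mutual information between an adversary's discrete action and a victim's subsequent discrete action. This is not the AMI metric itself: the function name `mutual_information`, the toy samples, and the choice of a count-based estimator are all assumptions made for this example.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in nats from a list of (x, y) samples.

    Hypothetical helper for illustration; x could be an adversary action
    and y the victim's next action, both discrete.
    """
    n = len(pairs)
    joint = Counter(pairs)                 # counts of (x, y)
    px = Counter(x for x, _ in pairs)      # marginal counts of x
    py = Counter(y for _, y in pairs)      # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Toy samples (assumed data): the victim copies the adversary...
dependent = [(0, 0), (1, 1)] * 50
# ...versus the victim acting independently of the adversary.
independent = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25

print(mutual_information(dependent))    # close to log(2)
print(mutual_information(independent))  # 0.0
```

A higher value means the victim's behavior is more predictable from the adversary's action. Standard mutual information is symmetric, so it cannot by itself separate the adversary's influence on the victim from the victim's influence on the adversary, which is the gap the abstract's unilateral adaptation is described as addressing.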


Source journal

Neural Networks (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 13.90

Self-citation rate: 7.70%

Articles published: 425

Review time: 67 days

About the journal: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.