Tacit mechanism: Bridging pre-training of individuality to multi-agent adversarial coordination

Shiqing Yao, Jiajun Chai, Haixin Yu, Yongzhe Chang, Tiantian Zhang, Yuanheng Zhu, Xueqian Wang

Neural Networks, Volume 194, Article 108121. Published 2025-09-16. DOI: 10.1016/j.neunet.2025.108121
To tackle the multi-agent adversarial coordination problem, current multi-agent reinforcement learning (MARL) algorithms depend primarily on team-based rewards to update agent policies. However, they do not fully exploit the spatial relationships among agents or how those relationships evolve, which limits overall performance. Inspired by human tactics, we propose the concept of tacit behavior to improve the efficiency of multi-agent reinforcement learning by refining the learning process. This paper introduces a novel two-phase framework that learns Pre-trained Tacit Behavior for efficient multi-agent adversarial Coordination (PTBC). The framework consists of a tacit pre-training phase and a centralized adversarial training phase. To pre-train the tacit behaviors, we develop a pattern mechanism and a tacit mechanism that integrate spatial relationships among agents and dynamically guide agents’ actions toward spatially advantageous positions for coordination. In the subsequent centralized adversarial training phase, we use the pre-trained network to promote the formation of advantageous spatial positioning, achieving more efficient learning. Experimental results in the predator-prey and StarCraft Multi-Agent Challenge (SMAC) environments demonstrate the effectiveness of our method through comparisons with several baseline algorithms with distinct strengths. Additionally, by visualizing the agents’ behavior in adversarial tasks, we show that incorporating inter-agent relationships enables agents with pre-trained tacit behavior to coordinate more advantageously. Extensive ablation studies demonstrate the critical role of tacit guidance and the general applicability of the PTBC framework.
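To make the two-phase pipeline concrete, the following is a minimal sketch of how a pre-trained tacit network could bias an agent's policy toward spatially advantageous actions. The abstract does not specify implementation details, so all names here (TacitNet, spatial_features, pretrain_tacit, mix_logits), the input featurization, and the supervision signal are illustrative assumptions rather than the authors' actual method.

```python
# Hypothetical sketch of the two-phase PTBC pipeline described above.
# All names and the supervision targets are assumptions for illustration,
# not the paper's code.
import torch
import torch.nn as nn


class TacitNet(nn.Module):
    """Maps inter-agent spatial relationships to an action-preference signal."""

    def __init__(self, n_agents: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_agents * 2, hidden),  # (dx, dy) offsets to every agent
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, rel_positions: torch.Tensor) -> torch.Tensor:
        return self.net(rel_positions)  # (batch, n_actions) logits


def spatial_features(positions: torch.Tensor, agent_idx: int) -> torch.Tensor:
    """positions: (batch, n_agents, 2). Returns the offsets of all agents
    relative to agent `agent_idx`, flattened to (batch, n_agents * 2)."""
    rel = positions - positions[:, agent_idx:agent_idx + 1, :]
    return rel.flatten(start_dim=1)


def pretrain_tacit(tacit: TacitNet, batches, optimizer) -> None:
    """Phase 1: supervise the tacit net on spatially advantageous actions.
    How the target actions are produced (the pattern mechanism) is left
    abstract here; any heuristic rewarding good positioning would do."""
    loss_fn = nn.CrossEntropyLoss()
    for positions, agent_idx, target_actions in batches:
        logits = tacit(spatial_features(positions, agent_idx))
        loss = loss_fn(logits, target_actions)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()


def mix_logits(policy_logits: torch.Tensor, tacit_logits: torch.Tensor,
               alpha: float = 0.5) -> torch.Tensor:
    """Phase 2: bias the RL policy toward the frozen tacit preferences."""
    return policy_logits + alpha * tacit_logits.detach()


if __name__ == "__main__":
    n_agents, n_actions = 4, 5
    tacit = TacitNet(n_agents, n_actions)
    opt = torch.optim.Adam(tacit.parameters(), lr=1e-3)
    # Toy data standing in for pattern-mechanism supervision.
    batches = [(torch.randn(8, n_agents, 2), 0,
                torch.randint(0, n_actions, (8,))) for _ in range(3)]
    pretrain_tacit(tacit, batches, opt)
    policy_logits = torch.randn(8, n_actions)
    tacit_logits = tacit(spatial_features(batches[0][0], 0))
    print(mix_logits(policy_logits, tacit_logits).shape)  # torch.Size([8, 5])
```

In the full framework, a step like mix_logits would sit inside a centralized MARL learner during the adversarial training phase, with the tacit network frozen after phase 1; the additive mixing above is just one plausible way to realize the spatial guidance the abstract describes.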
Journal Introduction:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.