Alexander Chernyavskiy , Alexey Skrynnik , Aleksandr Panov
{"title":"Applying opponent and environment modelling in decentralised multi-agent reinforcement learning","authors":"Alexander Chernyavskiy , Alexey Skrynnik , Aleksandr Panov","doi":"10.1016/j.cogsys.2024.101306","DOIUrl":null,"url":null,"abstract":"<div><div>Multi-agent reinforcement learning (MARL) has recently gained popularity and achieved much success in different kind of games such as zero-sum, cooperative or general-sum games. Nevertheless, the vast majority of modern algorithms assume information sharing during training and, hence, could not be utilised in decentralised applications as well as leverage high-dimensional scenarios and be applied to applications with general or sophisticated reward structure. Thus, due to collecting expenses and sparsity of data in real-world applications it becomes necessary to use world models to model the environment dynamics, using latent variables — i.e. use world model to generate synthetic data for training of MARL algorithms. Therefore, focusing on the paradigm of decentralised training and decentralised execution, we propose an extension to the model-based reinforcement learning approaches leveraging fully decentralised training with planning conditioned on neighbouring co-players’ latent representations. Our approach is inspired by the idea of opponent modelling. The method makes the agent learn in joint latent space without need to interact with the environment. We suggest the approach as proof of concept that decentralised model-based algorithms are able to emerge collective behaviour with limited communication during planning, and demonstrate its necessity on iterated matrix games and modified versions of StarCraft Multi-Agent Challenge (SMAC).</div></div>","PeriodicalId":55242,"journal":{"name":"Cognitive Systems Research","volume":"89 ","pages":"Article 101306"},"PeriodicalIF":2.1000,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cognitive Systems Research","FirstCategoryId":"102","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1389041724001001","RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Multi-agent reinforcement learning (MARL) has recently gained popularity and achieved much success in different kind of games such as zero-sum, cooperative or general-sum games. Nevertheless, the vast majority of modern algorithms assume information sharing during training and, hence, could not be utilised in decentralised applications as well as leverage high-dimensional scenarios and be applied to applications with general or sophisticated reward structure. Thus, due to collecting expenses and sparsity of data in real-world applications it becomes necessary to use world models to model the environment dynamics, using latent variables — i.e. use world model to generate synthetic data for training of MARL algorithms. Therefore, focusing on the paradigm of decentralised training and decentralised execution, we propose an extension to the model-based reinforcement learning approaches leveraging fully decentralised training with planning conditioned on neighbouring co-players’ latent representations. Our approach is inspired by the idea of opponent modelling. The method makes the agent learn in joint latent space without need to interact with the environment. We suggest the approach as proof of concept that decentralised model-based algorithms are able to emerge collective behaviour with limited communication during planning, and demonstrate its necessity on iterated matrix games and modified versions of StarCraft Multi-Agent Challenge (SMAC).
期刊介绍:
Cognitive Systems Research is dedicated to the study of human-level cognition. As such, it welcomes papers which advance the understanding, design and applications of cognitive and intelligent systems, both natural and artificial.
The journal brings together a broad community studying cognition in its many facets in vivo and in silico, across the developmental spectrum, focusing on individual capacities or on entire architectures. It aims to foster debate and integrate ideas, concepts, constructs, theories, models and techniques from across different disciplines and different perspectives on human-level cognition. The scope of interest includes the study of cognitive capacities and architectures - both brain-inspired and non-brain-inspired - and the application of cognitive systems to real-world problems as far as it offers insights relevant for the understanding of cognition.
Cognitive Systems Research therefore welcomes mature and cutting-edge research approaching cognition from a systems-oriented perspective, both theoretical and empirically-informed, in the form of original manuscripts, short communications, opinion articles, systematic reviews, and topical survey articles from the fields of Cognitive Science (including Philosophy of Cognitive Science), Artificial Intelligence/Computer Science, Cognitive Robotics, Developmental Science, Psychology, and Neuroscience and Neuromorphic Engineering. Empirical studies will be considered if they are supplemented by theoretical analyses and contributions to theory development and/or computational modelling studies.