游戏中的多智能体强化学习：研究与应用。

IF 3.4 3区医学 Q1 ENGINEERING, MULTIDISCIPLINARY

Biomimetics Pub Date : 2025-06-06 DOI:10.3390/biomimetics10060375

Haiyang Li, Ping Yang, Weidong Liu, Shaoqiang Yan, Xinyi Zhang, Donglin Zhu

{"title":"游戏中的多智能体强化学习：研究与应用。","authors":"Haiyang Li, Ping Yang, Weidong Liu, Shaoqiang Yan, Xinyi Zhang, Donglin Zhu","doi":"10.3390/biomimetics10060375","DOIUrl":null,"url":null,"abstract":"Biological systems, ranging from ant colonies to neural ecosystems, exhibit remarkable self-organizing intelligence. Inspired by these phenomena, this study investigates how bio-inspired computing principles can bridge game-theoretic rationality and multi-agent adaptability. This study systematically reviews the convergence of multi-agent reinforcement learning (MARL) and game theory, elucidating the innovative potential of this integrated paradigm for collective intelligent decision-making in dynamic open environments. Building upon stochastic game and extensive-form game-theoretic frameworks, we establish a methodological taxonomy across three dimensions: value function optimization, policy gradient learning, and online search planning, thereby clarifying the evolutionary logic and innovation trajectories of algorithmic advancements. Focusing on complex smart city scenarios-including intelligent transportation coordination and UAV swarm scheduling-we identify technical breakthroughs in MARL applications for policy space modeling and distributed decision optimization. By incorporating bio-inspired optimization approaches, the investigation particularly highlights evolutionary computation mechanisms for dynamic strategy generation in search planning, alongside population-based learning paradigms for enhancing exploration efficiency in policy refinement. The findings reveal core principles governing how groups make optimal choices in complex environments while mapping the technological development pathways created by blending cross-disciplinary methods to enhance multi-agent systems.","PeriodicalId":8907,"journal":{"name":"Biomimetics","volume":"10 6","pages":""},"PeriodicalIF":3.4000,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12190516/pdf/","citationCount":"0","resultStr":"{\"title\":\"Multi-Agent Reinforcement Learning in Games: Research and Applications.\",\"authors\":\"Haiyang Li, Ping Yang, Weidong Liu, Shaoqiang Yan, Xinyi Zhang, Donglin Zhu\",\"doi\":\"10.3390/biomimetics10060375\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Biological systems, ranging from ant colonies to neural ecosystems, exhibit remarkable self-organizing intelligence. Inspired by these phenomena, this study investigates how bio-inspired computing principles can bridge game-theoretic rationality and multi-agent adaptability. This study systematically reviews the convergence of multi-agent reinforcement learning (MARL) and game theory, elucidating the innovative potential of this integrated paradigm for collective intelligent decision-making in dynamic open environments. Building upon stochastic game and extensive-form game-theoretic frameworks, we establish a methodological taxonomy across three dimensions: value function optimization, policy gradient learning, and online search planning, thereby clarifying the evolutionary logic and innovation trajectories of algorithmic advancements. Focusing on complex smart city scenarios-including intelligent transportation coordination and UAV swarm scheduling-we identify technical breakthroughs in MARL applications for policy space modeling and distributed decision optimization. By incorporating bio-inspired optimization approaches, the investigation particularly highlights evolutionary computation mechanisms for dynamic strategy generation in search planning, alongside population-based learning paradigms for enhancing exploration efficiency in policy refinement. The findings reveal core principles governing how groups make optimal choices in complex environments while mapping the technological development pathways created by blending cross-disciplinary methods to enhance multi-agent systems.\",\"PeriodicalId\":8907,\"journal\":{\"name\":\"Biomimetics\",\"volume\":\"10 6\",\"pages\":\"\"},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2025-06-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12190516/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Biomimetics\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://doi.org/10.3390/biomimetics10060375\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, MULTIDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biomimetics","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.3390/biomimetics10060375","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, MULTIDISCIPLINARY","Score":null,"Total":0}

引用次数: 0

摘要

生物系统，从蚁群到神经生态系统，都表现出非凡的自组织智能。受这些现象的启发，本研究探讨了受生物启发的计算原理如何在博弈论理性和多智能体适应性之间建立桥梁。本研究系统地回顾了多智能体强化学习（MARL）和博弈论的收敛性，阐明了这种集成范式在动态开放环境下集体智能决策中的创新潜力。基于随机博弈论和广义博弈论框架，我们建立了价值函数优化、策略梯度学习和在线搜索规划三个维度的方法分类，从而阐明了算法进步的进化逻辑和创新轨迹。针对复杂的智慧城市场景，包括智能交通协调和无人机群调度，我们确定了MARL应用在政策空间建模和分布式决策优化方面的技术突破。通过结合生物启发的优化方法，该研究特别强调了搜索规划中动态策略生成的进化计算机制，以及提高策略优化中探索效率的基于群体的学习范式。这些发现揭示了在复杂环境中群体如何做出最佳选择的核心原则，同时描绘了通过混合跨学科方法来增强多智能体系统所创造的技术发展路径。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Multi-Agent Reinforcement Learning in Games: Research and Applications.

Biological systems, ranging from ant colonies to neural ecosystems, exhibit remarkable self-organizing intelligence. Inspired by these phenomena, this study investigates how bio-inspired computing principles can bridge game-theoretic rationality and multi-agent adaptability. This study systematically reviews the convergence of multi-agent reinforcement learning (MARL) and game theory, elucidating the innovative potential of this integrated paradigm for collective intelligent decision-making in dynamic open environments. Building upon stochastic game and extensive-form game-theoretic frameworks, we establish a methodological taxonomy across three dimensions: value function optimization, policy gradient learning, and online search planning, thereby clarifying the evolutionary logic and innovation trajectories of algorithmic advancements. Focusing on complex smart city scenarios-including intelligent transportation coordination and UAV swarm scheduling-we identify technical breakthroughs in MARL applications for policy space modeling and distributed decision optimization. By incorporating bio-inspired optimization approaches, the investigation particularly highlights evolutionary computation mechanisms for dynamic strategy generation in search planning, alongside population-based learning paradigms for enhancing exploration efficiency in policy refinement. The findings reveal core principles governing how groups make optimal choices in complex environments while mapping the technological development pathways created by blending cross-disciplinary methods to enhance multi-agent systems.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊