Tianbo An, Huizhen Zhang, Zhanshuo Zhang, Guanghui Liu, Jiayu Li, Liangyu Chen, Zhen Wang
{"title":"结构化群体中具有互动多样性的强化学习驱动的合作动态","authors":"Tianbo An , Huizhen Zhang , Zhanshuo Zhang , Guanghui Liu , Jiayu Li , Liangyu Chen , Zhen Wang","doi":"10.1016/j.chaos.2025.117308","DOIUrl":null,"url":null,"abstract":"<div><div>In reality, individuals tend to make different decisions based on differences in relationships and behaviors with their neighbors. Based on this observation, the paper explores the evolution of cooperative behavior when agents develop separated actions for each neighbor by the reinforcement learning approach. Through simulation experiments, it is shown that our model improves the cooperative level compared to results that only consider the agent’s own behavior. This is because agents tend to adopt cooperative strategies toward their neighbors while avoiding exploitation, thus promoting the steady expansion of cooperation. Notably, we find that agents do not always choose the action with the highest expected rewards. Therefore, we classify the behavior strategies of the agents into 16 types, corresponding to all possible combinations of actions selected in different states. We observe that agents adopting a specific behavior strategy tend to dominate the evolutionary process: when they choose to cooperate, they switch to defection in the next round regardless of the opponent’s action; conversely, when they defect, they switch to cooperation in the next round, again independent of the opponent’s behavior. These agents are typically distributed among others with different strategy types, playing a bridging and buffering role. By facilitating the expansion of neighboring agents, they contribute to the spread of cooperative behavior and ultimately enhance the overall level of cooperation in the population. Similar phenomena are also observed under initial specific distributions (e.g., ALLC, ALLD). Next, the hyperparameters of reinforcement learning are analyzed, and the results show that cooperation is easier to maintain and expand when agents make decisions based on past experiences and fully consider potential future rewards. We also compare this model with a control model that adopted the assumption of interactive homogeneity, and further examine the impact of different network structures on the cooperative evolution. Finally, we introduce the memory mechanism of agents as an extended analysis of the model.</div></div>","PeriodicalId":9764,"journal":{"name":"Chaos Solitons & Fractals","volume":"201 ","pages":"Article 117308"},"PeriodicalIF":5.6000,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Cooperation dynamics driven by reinforcement learning with interactive diversity in structured populations\",\"authors\":\"Tianbo An , Huizhen Zhang , Zhanshuo Zhang , Guanghui Liu , Jiayu Li , Liangyu Chen , Zhen Wang\",\"doi\":\"10.1016/j.chaos.2025.117308\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>In reality, individuals tend to make different decisions based on differences in relationships and behaviors with their neighbors. Based on this observation, the paper explores the evolution of cooperative behavior when agents develop separated actions for each neighbor by the reinforcement learning approach. Through simulation experiments, it is shown that our model improves the cooperative level compared to results that only consider the agent’s own behavior. 
This is because agents tend to adopt cooperative strategies toward their neighbors while avoiding exploitation, thus promoting the steady expansion of cooperation. Notably, we find that agents do not always choose the action with the highest expected rewards. Therefore, we classify the behavior strategies of the agents into 16 types, corresponding to all possible combinations of actions selected in different states. We observe that agents adopting a specific behavior strategy tend to dominate the evolutionary process: when they choose to cooperate, they switch to defection in the next round regardless of the opponent’s action; conversely, when they defect, they switch to cooperation in the next round, again independent of the opponent’s behavior. These agents are typically distributed among others with different strategy types, playing a bridging and buffering role. By facilitating the expansion of neighboring agents, they contribute to the spread of cooperative behavior and ultimately enhance the overall level of cooperation in the population. Similar phenomena are also observed under initial specific distributions (e.g., ALLC, ALLD). Next, the hyperparameters of reinforcement learning are analyzed, and the results show that cooperation is easier to maintain and expand when agents make decisions based on past experiences and fully consider potential future rewards. We also compare this model with a control model that adopted the assumption of interactive homogeneity, and further examine the impact of different network structures on the cooperative evolution. Finally, we introduce the memory mechanism of agents as an extended analysis of the model.</div></div>\",\"PeriodicalId\":9764,\"journal\":{\"name\":\"Chaos Solitons & Fractals\",\"volume\":\"201 \",\"pages\":\"Article 117308\"},\"PeriodicalIF\":5.6000,\"publicationDate\":\"2025-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Chaos Solitons & Fractals\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0960077925013219\",\"RegionNum\":1,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MATHEMATICS, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Chaos Solitons & Fractals","FirstCategoryId":"100","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0960077925013219","RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MATHEMATICS, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Cooperation dynamics driven by reinforcement learning with interactive diversity in structured populations
In reality, individuals tend to make different decisions depending on their relationships with, and the behavior of, their neighbors. Based on this observation, the paper explores the evolution of cooperative behavior when agents learn a separate action for each neighbor through reinforcement learning. Simulation experiments show that this model improves the level of cooperation compared with models in which agents consider only their own behavior. This is because agents tend to adopt cooperative strategies toward their neighbors while avoiding exploitation, thereby promoting the steady expansion of cooperation. Notably, we find that agents do not always choose the action with the highest expected reward. We therefore classify the agents’ behavior strategies into 16 types, corresponding to all possible combinations of actions selected in the different states. We observe that agents adopting one specific behavior strategy tend to dominate the evolutionary process: after cooperating, they switch to defection in the next round regardless of the opponent’s action; conversely, after defecting, they switch to cooperation in the next round, again independent of the opponent’s behavior. These agents are typically distributed among agents with other strategy types, where they play a bridging and buffering role. By facilitating the expansion of neighboring agents, they contribute to the spread of cooperative behavior and ultimately raise the overall level of cooperation in the population. Similar phenomena are also observed under specific initial distributions (e.g., ALLC, ALLD). Next, the hyperparameters of reinforcement learning are analyzed, and the results show that cooperation is easier to maintain and expand when agents make decisions based on past experience and fully account for potential future rewards. We also compare this model with a control model that adopts the assumption of interactive homogeneity, and further examine the impact of different network structures on the cooperative evolution. Finally, we introduce an agent memory mechanism as an extended analysis of the model.
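To make the setup concrete, the following is a minimal sketch, not the authors' implementation, of per-neighbor Q-learning in a spatial prisoner's dilemma. It assumes an L×L square lattice with periodic boundaries, weak prisoner's-dilemma payoffs (R=1, P=S=0, temptation T=b), an epsilon-greedy policy, and a state defined as the pair (own last action, opponent's last action) toward that neighbor; the parameter values (alpha, gamma, epsilon, b) are placeholders. Under this state definition, the 16 behavior-strategy types mentioned above would correspond to the 2^4 deterministic mappings from the four states (CC, CD, DC, DD) to the next action.

import numpy as np

# Illustrative parameters (assumed, not taken from the paper)
L = 20          # side length of the square lattice
alpha = 0.8     # learning rate of the Q-update
gamma = 0.9     # discount factor: weight on potential future rewards
epsilon = 0.02  # exploration probability
b = 1.2         # temptation to defect in the weak prisoner's dilemma
C, D = 0, 1     # actions: cooperate / defect

rng = np.random.default_rng(0)

def payoff(a_self, a_opp):
    """Payoff to the focal agent in the weak prisoner's dilemma."""
    return [[1.0, 0.0], [b, 0.0]][a_self][a_opp]

def neighbors(i, j):
    """Von Neumann neighborhood with periodic boundaries."""
    return [((i - 1) % L, j), ((i + 1) % L, j), (i, (j - 1) % L), (i, (j + 1) % L)]

agents = [(i, j) for i in range(L) for j in range(L)]

# Interactive diversity: one Q-table per (agent, neighbor) pair.
# State = (own last action, opponent's last action) toward that neighbor.
Q = {a: {n: np.zeros((2, 2, 2)) for n in neighbors(*a)} for a in agents}
init = {a: {n: int(rng.integers(2)) for n in neighbors(*a)} for a in agents}
state = {a: {n: (init[a][n], init[n][a]) for n in neighbors(*a)} for a in agents}

for step in range(200):
    # Each agent picks an action separately for every neighbor (epsilon-greedy).
    act = {a: {} for a in agents}
    for a in agents:
        for n, q in Q[a].items():
            s = state[a][n]
            act[a][n] = int(rng.integers(2)) if rng.random() < epsilon else int(np.argmax(q[s]))
    # Pairwise games, rewards, and per-neighbor Q-learning updates.
    for a in agents:
        for n, q in Q[a].items():
            own, opp = act[a][n], act[n][a]
            r = payoff(own, opp)
            s, s_next = state[a][n], (own, opp)
            q[s + (own,)] += alpha * (r + gamma * np.max(q[s_next]) - q[s + (own,)])
            state[a][n] = s_next

coop = np.mean([x == C for d in act.values() for x in d.values()])
print(f"fraction of cooperative actions: {coop:.2f}")

In this sketch, alpha and gamma play the role of the hyperparameters discussed in the abstract: a larger gamma means the agent more fully accounts for potential future rewards, while the control model with interactive homogeneity would instead keep a single Q-table per agent shared across all neighbors.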
Journal introduction:
Chaos, Solitons & Fractals strives to establish itself as a premier journal in the interdisciplinary realm of Nonlinear Science, Non-equilibrium, and Complex Phenomena. It welcomes submissions covering a broad spectrum of topics within this field, including dynamics, non-equilibrium processes in physics, chemistry, and geophysics, complex matter and networks, mathematical models, computational biology, applications to quantum and mesoscopic phenomena, fluctuations and random processes, self-organization, and social phenomena.