Promoting cooperation in the voluntary prisoner's dilemma game via reinforcement learning.

IF 2.7 | Zone 2 (Mathematics) | Q1 MATHEMATICS, APPLIED

Chaos Pub Date : 2025-04-01 DOI:10.1063/5.0267846

Yijie Huang, Yanhong Chen

Citations: 0

Abstract

Reinforcement learning technology has been empirically demonstrated to facilitate cooperation in game models. However, traditional research has primarily focused on two-strategy frameworks (cooperation and defection), which do not adequately capture the complexity of real-world scenarios. To address this limitation, we integrated Q-learning into the prisoner's dilemma game, incorporating three strategies: cooperation, defection, and going it alone. We defined each agent's state based on the number of neighboring agents opting for cooperation and included social payoff in the Q-table update process. Numerical simulations indicate that this framework significantly enhances cooperation and average payoff as the degree of social attention increases. This phenomenon occurs because social payoff enables individuals to move beyond narrow self-interest and consider broader social benefits. Additionally, we conducted a thorough analysis of the mechanisms underlying this enhancement of cooperation.
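The ingredients named in the abstract (three strategies, a state equal to the number of cooperating neighbors, and a social payoff entering the Q-table update) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything beyond those three ingredients is an assumption made for illustration: the payoff values (temptation `b`, loner payoff `sigma`), the blended reward `(1 - w) * own + w * mean(neighbor payoffs)` with `w` standing in for the degree of social attention, the learning parameters `alpha` and `gamma`, and the epsilon-greedy action rule. All function names are hypothetical.

```python
import numpy as np

# Action indices: cooperate, defect, go it alone (loner).
C, D, L = 0, 1, 2


def payoff_matrix(b=1.5, sigma=0.3):
    """Row player's payoff against a column player.

    Assumed parameterization (not taken from the abstract): mutual
    cooperation pays 1, defecting on a cooperator pays b, and any
    pairing that involves a loner pays sigma to both sides.
    """
    return np.array([
        [1.0, 0.0, sigma],      # C against C, D, L
        [b, 0.0, sigma],        # D against C, D, L
        [sigma, sigma, sigma],  # L against C, D, L
    ])


def state_of(neighbor_actions):
    """State = number of neighbors that chose cooperation."""
    return sum(1 for a in neighbor_actions if a == C)


def blended_reward(own_payoff, neighbor_payoffs, w):
    """Assumed social-payoff form: mix own payoff with the neighbors' mean."""
    return (1.0 - w) * own_payoff + w * float(np.mean(neighbor_payoffs))


def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Standard Q-learning update, applied in place to the table q."""
    q[s, a] += alpha * (reward + gamma * q[s_next].max() - q[s, a])


def choose_action(q, s, epsilon, rng):
    """Epsilon-greedy choice among the three actions (an assumed rule)."""
    if rng.random() < epsilon:
        return int(rng.integers(3))
    return int(np.argmax(q[s]))


# One illustrative step for an agent with four neighbors (states 0..4).
rng = np.random.default_rng(0)
q = np.zeros((5, 3))
M = payoff_matrix()
neighbors = [C, C, D, L]
s = state_of(neighbors)
a = choose_action(q, s, epsilon=0.1, rng=rng)
own = float(sum(M[a, n] for n in neighbors))
# Hypothetical neighbor payoffs; a full simulation would compute them.
reward = blended_reward(own, [2.0, 1.0, 3.0, 1.2], w=0.5)
q_update(q, s, a, reward, s_next=s)
```

With `w = 0` the agent learns from its own payoff only, while larger `w` puts more weight on how its neighbors fare; this is one plausible reading of "social attention", not a confirmed one.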

Source journal

Chaos (Physics: Mathematical Physics)

CiteScore: 5.20

Self-citation rate: 13.80%

Articles published: 448

Review time: 2.3 months

About the journal: Chaos: An Interdisciplinary Journal of Nonlinear Science is a peer-reviewed journal devoted to increasing the understanding of nonlinear phenomena and describing the manifestations in a manner comprehensible to researchers from a broad spectrum of disciplines.