{"title":"Improving Exploration in Deep Reinforcement Learning for Incomplete Information Competition Environments","authors":"Jie Lin;Yuhao Ye;Shaobo Li;Hanlin Zhang;Peng Zhao","doi":"10.1109/TETCI.2025.3555250","DOIUrl":null,"url":null,"abstract":"The sparse reward problem widely exists in multi-agent deep reinforcement learning, preventing agents from learning optimal actions and resulting in inefficient interactions with the environment. Many efforts have been made to design denser rewards and promote agent exploration. However, existing methods only focus on the breadth of action exploration, neglecting the rationality of action exploration in deep reinforcement learning, which leads to inefficient action exploration for agents. To address this issue, in this paper, we propose a novel curiosity-based action exploration method in incomplete information competition game environments, namely IGC, to improve both the breadth and rationality of action exploitation in multi-agent deep reinforcement learning for sparse-reward environments. Particularly, to enhance the capability of action exploration for agents, the distance reward is designed in our IGC method to increase the density of rewards in action exploration, thereby mitigating the sparse reward problem. In addition, by integrating the Intrinsic Curiosity Module (ICM) into DQN, we propose an enhanced ICM-DQN module, which enhances the breadth and rationality of subject action exploration for agents. By doing this, our IGC method can mitigate the randomness of the existing curiosity mechanism and increase the rationality of action exploration of agents, thereby enhancing the efficiency of action exploration. Finally, we evaluate the effectiveness of our IGC method on an incomplete information card game, namely Uno card game. The results demonstrate that our IGC method can achieve both better action exploration efficiency and greater winning-rate in comparison with existing methods.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"9 5","pages":"3665-3676"},"PeriodicalIF":5.3000,"publicationDate":"2025-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Emerging Topics in Computational Intelligence","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10964687/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
The sparse reward problem is widespread in multi-agent deep reinforcement learning, preventing agents from learning optimal actions and resulting in inefficient interactions with the environment. Many efforts have been made to design denser rewards and promote agent exploration. However, existing methods focus only on the breadth of action exploration and neglect its rationality, which leads to inefficient action exploration for agents. To address this issue, we propose a novel curiosity-based action exploration method for incomplete-information competition game environments, namely IGC, to improve both the breadth and the rationality of action exploration in multi-agent deep reinforcement learning under sparse rewards. In particular, to strengthen the action-exploration capability of agents, a distance reward is designed in our IGC method to increase the density of rewards during exploration, thereby mitigating the sparse reward problem. In addition, by integrating the Intrinsic Curiosity Module (ICM) into DQN, we propose an enhanced ICM-DQN module, which improves the breadth and rationality of agents' action exploration. In this way, our IGC method mitigates the randomness of the existing curiosity mechanism and increases the rationality of agents' action exploration, thereby improving exploration efficiency. Finally, we evaluate the effectiveness of our IGC method on an incomplete-information card game, namely the Uno card game. The results demonstrate that, compared with existing methods, our IGC method achieves both better action exploration efficiency and a higher winning rate.
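To make the reward-densification idea in the abstract concrete, the sketch below shows one common way an ICM-style curiosity bonus and an extra shaping term can be folded into a DQN update. This is a minimal illustration under stated assumptions, not the paper's implementation: the network sizes, the weighting coefficients `eta` and `beta`, and the `dist_bonus` term (standing in for the paper's distance reward) are all illustrative choices.

```python
# Minimal sketch (assumed, not the paper's code): ICM curiosity bonus + a
# distance-style shaping term added to the extrinsic reward before the DQN
# TD update. Discrete action space assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICM(nn.Module):
    """Intrinsic Curiosity Module: the forward-model prediction error on a
    learned state embedding serves as the intrinsic (curiosity) reward."""
    def __init__(self, obs_dim, n_actions, feat_dim=64):
        super().__init__()
        self.n_actions = n_actions
        self.encoder = nn.Sequential(nn.Linear(obs_dim, feat_dim), nn.ReLU())
        # Forward model: predict phi(s') from phi(s) and a one-hot action.
        self.forward_model = nn.Sequential(
            nn.Linear(feat_dim + n_actions, 128), nn.ReLU(),
            nn.Linear(128, feat_dim))
        # Inverse model: predict the action from phi(s) and phi(s').
        self.inverse_model = nn.Sequential(
            nn.Linear(2 * feat_dim, 128), nn.ReLU(),
            nn.Linear(128, n_actions))

    def forward(self, obs, action, next_obs):
        phi, phi_next = self.encoder(obs), self.encoder(next_obs)
        a_onehot = F.one_hot(action, self.n_actions).float()
        phi_next_pred = self.forward_model(torch.cat([phi, a_onehot], dim=-1))
        action_logits = self.inverse_model(torch.cat([phi, phi_next], dim=-1))
        # Curiosity reward: squared prediction error of the forward model.
        r_int = 0.5 * (phi_next_pred - phi_next.detach()).pow(2).sum(-1)
        icm_loss = r_int.mean() + F.cross_entropy(action_logits, action)
        return r_int.detach(), icm_loss

def dqn_td_loss(q_net, target_net, icm, batch,
                eta=0.01, beta=0.1, gamma=0.99):
    """One TD step where the learning signal is the extrinsic reward plus the
    ICM curiosity bonus and a shaping term (a hypothetical stand-in for the
    distance reward described in the abstract)."""
    obs, action, r_ext, next_obs, done, dist_bonus = batch
    r_int, icm_loss = icm(obs, action, next_obs)
    total_r = r_ext + eta * r_int + beta * dist_bonus  # densified reward
    with torch.no_grad():
        target = total_r + gamma * (1 - done) * target_net(next_obs).max(-1).values
    q_sa = q_net(obs).gather(-1, action.unsqueeze(-1)).squeeze(-1)
    return F.smooth_l1_loss(q_sa, target) + icm_loss
```

In this sketch the curiosity term encourages visiting states the forward model predicts poorly (breadth), while the shaping term biases exploration toward task-relevant states; how the paper balances these two signals to achieve "rational" exploration is specific to the IGC method and not reproduced here.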
Journal Introduction:
The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys.
TETCI is an electronic-only publication. TETCI publishes six issues per year.
Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. A few illustrative examples are glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for the IoT and Smart-X technologies.