Contrastive Learning-Based Agent Modeling for Deep Reinforcement Learning

IF 5.3 · CAS Tier 3 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Wenhao Ma;Yu-Chen Chang;Jie Yang;Yu-Kai Wang;Chin-Teng Lin
{"title":"基于对比学习的深度强化学习智能体建模","authors":"Wenhao Ma;Yu-Chen Chang;Jie Yang;Yu-Kai Wang;Chin-Teng Lin","doi":"10.1109/TETCI.2025.3595684","DOIUrl":null,"url":null,"abstract":"Multi-agent systems often require agents to collaborate with or compete against other agents with diverse goals, behaviors, or strategies. Agent modeling is essential when designing adaptive policies for intelligent machine agents in multi-agent systems, as this is the means by which the controlled agent (ego agent) understands other agents' (modeled agents) behavior and extracts their meaningful policy representations. These representations can be used to enhance the ego agent's adaptive policy which is trained by reinforcement learning. However, existing agent modeling approaches typically assume the availability of local observations from modeled agents during training or a long observation trajectory for policy adaption. To remove these constrictive assumptions and improve agent modeling performance, we devised a <bold>C</b>ontrastive <bold>L</b>earning-based <bold>A</b>gent <bold>M</b>odeling (<bold>CLAM</b>) method that relies only on the local observations from the ego agent during training and execution. With these observations, CLAM is capable of generating consistent high-quality policy representations in real time right from the beginning of each episode. We evaluated the efficacy of our approach in both cooperative and competitive multi-agent environments. The experiment results demonstrate that our approach improves reinforcement learning performance by at least 28% on cooperative and competitive tasks, which exceeds the state-of-the-art.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"9 5","pages":"3719-3726"},"PeriodicalIF":5.3000,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Contrastive Learning-Based Agent Modeling for Deep Reinforcement Learning\",\"authors\":\"Wenhao Ma;Yu-Chen Chang;Jie Yang;Yu-Kai Wang;Chin-Teng Lin\",\"doi\":\"10.1109/TETCI.2025.3595684\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multi-agent systems often require agents to collaborate with or compete against other agents with diverse goals, behaviors, or strategies. Agent modeling is essential when designing adaptive policies for intelligent machine agents in multi-agent systems, as this is the means by which the controlled agent (ego agent) understands other agents' (modeled agents) behavior and extracts their meaningful policy representations. These representations can be used to enhance the ego agent's adaptive policy which is trained by reinforcement learning. However, existing agent modeling approaches typically assume the availability of local observations from modeled agents during training or a long observation trajectory for policy adaption. To remove these constrictive assumptions and improve agent modeling performance, we devised a <bold>C</b>ontrastive <bold>L</b>earning-based <bold>A</b>gent <bold>M</b>odeling (<bold>CLAM</b>) method that relies only on the local observations from the ego agent during training and execution. With these observations, CLAM is capable of generating consistent high-quality policy representations in real time right from the beginning of each episode. We evaluated the efficacy of our approach in both cooperative and competitive multi-agent environments. 
The experiment results demonstrate that our approach improves reinforcement learning performance by at least 28% on cooperative and competitive tasks, which exceeds the state-of-the-art.\",\"PeriodicalId\":13135,\"journal\":{\"name\":\"IEEE Transactions on Emerging Topics in Computational Intelligence\",\"volume\":\"9 5\",\"pages\":\"3719-3726\"},\"PeriodicalIF\":5.3000,\"publicationDate\":\"2025-08-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Emerging Topics in Computational Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11123723/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Emerging Topics in Computational Intelligence","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11123723/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Multi-agent systems often require agents to collaborate with or compete against other agents with diverse goals, behaviors, or strategies. Agent modeling is essential when designing adaptive policies for intelligent machine agents in multi-agent systems, as it is the means by which the controlled agent (ego agent) understands the behavior of other agents (modeled agents) and extracts meaningful representations of their policies. These representations can be used to enhance the ego agent's adaptive policy, which is trained by reinforcement learning. However, existing agent modeling approaches typically assume the availability of local observations from the modeled agents during training, or a long observation trajectory for policy adaptation. To remove these restrictive assumptions and improve agent modeling performance, we devised a Contrastive Learning-based Agent Modeling (CLAM) method that relies only on the local observations of the ego agent during both training and execution. With these observations, CLAM can generate consistent, high-quality policy representations in real time, right from the beginning of each episode. We evaluated the efficacy of our approach in both cooperative and competitive multi-agent environments. The experimental results demonstrate that our approach improves reinforcement learning performance by at least 28% on cooperative and competitive tasks, exceeding the state of the art.
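The abstract gives no implementation details, but the mechanism it names, a contrastive objective that shapes policy representations from the ego agent's own observation trajectories, can be sketched in a few lines. The following is a minimal, hypothetical illustration assuming an InfoNCE-style loss and a GRU segment encoder; the names TrajectoryEncoder and info_nce_loss, the architecture, and all dimensions are assumptions made for illustration, not the paper's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TrajectoryEncoder(nn.Module):
    """Illustrative encoder: maps an ego-agent observation segment to a
    unit-norm policy embedding. Architecture is an assumption, not CLAM's."""
    def __init__(self, obs_dim: int, embed_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, embed_dim)

    def forward(self, obs_seq: torch.Tensor) -> torch.Tensor:
        # obs_seq: (batch, time, obs_dim); summarize with the final hidden state.
        _, h_n = self.gru(obs_seq)
        return F.normalize(self.proj(h_n[-1]), dim=-1)

def info_nce_loss(anchor: torch.Tensor, positive: torch.Tensor,
                  temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE over a batch: row i of `anchor` and `positive` come from the
    same episode (same modeled-agent policy); all other rows are negatives."""
    logits = anchor @ positive.t() / temperature  # (batch, batch) similarities
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)

# Usage sketch: two disjoint segments per episode form the positive pair.
encoder = TrajectoryEncoder(obs_dim=16)
seg_a = torch.randn(32, 20, 16)  # 32 episodes, 20 steps, 16-dim ego observations
seg_b = torch.randn(32, 20, 16)
loss = info_nce_loss(encoder(seg_a), encoder(seg_b))
loss.backward()
```

In this sketch, two disjoint observation segments from the same episode form a positive pair, on the assumption that the modeled agents' policies stay fixed within an episode, while segments from the other episodes in the batch act as negatives; the learned embedding could then be fed to the ego agent's reinforcement learning policy as an additional input.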
Source Journal
CiteScore: 10.30
Self-citation rate: 7.50%
Articles published: 147
Journal description: The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys. TETCI is an electronic-only publication and publishes six issues per year. Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. A few illustrative examples are glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for IoT and Smart-X technologies.